1801.04819
## Chapter 45
## Robots as powerful allies for the study of embodied cognition from the bottom up
## Matej Hoffmann and Rolf Pfeifer
## Introduction
The study of human cognition, and of human intelligence, has a long history and has kept scientists from various disciplines (philosophy, psychology, linguistics, neuroscience, artificial intelligence, and robotics) busy for many years. While there is no agreement on its definition, there is wide consensus that it is a highly complex subject matter that will require, depending on the particular position or stance, a multiplicity of methods for its investigation. Whereas, for example, psychology and neuroscience favor empirical studies on humans, artificial intelligence has proposed computational approaches, viewing cognition as information processing, as algorithms over representations. Over the last few decades, overwhelming evidence has accumulated showing that the purely computational view is severely limited and that it must be extended to incorporate embodiment, i.e., the agent's somatic setup and its interaction with the real world. Because they are real physical systems, robots became the tools of choice to study cognition. There has been a plethora of pertinent studies, but they all have their own intrinsic limitations. In this chapter, we demonstrate that a robotic approach, combined with information theory and a developmental perspective, promises insights into the nature of cognition that would be hard to obtain otherwise.
We start by introducing 'low-level' behaviors that function without control in the traditional sense; we then move to sensorimotor processes that incorporate reflex-based loops (involving neural processing). We discuss 'minimal cognition' and show how the role of embodiment can be quantified using information theory, and we introduce the so-called SMCs, or sensorimotor contingencies, which can be viewed as the very basic building blocks of cognition. Finally, we expand on how humanoid robots can be productively exploited to make inroads in the study of human cognition.
## Behavior Through Interaction
What cognitive scientists regularly forget is that complex coordinated behaviors (for example, walking, running over uneven terrain, swimming, or avoiding obstacles) can often be realized with no or minimal involvement of cognition, representation, or computation. This is possible because of the properties of the body and its interaction with the environment, that is, the embodied and embedded nature of the agent. Robotics is well suited for providing existence proofs of this kind and for analyzing these phenomena further. We will only briefly present some of the most notable case studies.
## Low-Level Behavior: Mechanical Feedback Loops
A classical illustration of behavior in the complete absence of a 'brain' is the passive dynamic walker (McGeer 1990): a minimal robot that can walk without any sensors, motors, or control electronics. It loosely resembles a human, with two legs, no torso, and two arms attached to the 'hips,' but its ability to walk is exclusively due to the downward slope of the incline on which it walks and the mechanical parameters of the walker (mainly leg segment lengths, mass distribution, foot shape, and frictional characteristics). The walking movement is entirely the result of finely tuned mechanics on the right kind of surface. A further motivation for this research is to show how human walking is possible with minimal energy use and only limited central control. However, most of the problems that animals or robots face in the real world cannot be solved solely by passive interaction of the physical body with the environment. Typically, active involvement by means of muscles or motors is required. Furthermore, the actuation pattern needs to be specified by the agent,¹ and hence a controller of some sort is required. Nevertheless, it turns out that if the physical interaction of the body with the environment is exploited, the control program can be very simple. For example, the passive dynamic walker can be modified by adding a couple of actuators and sensors and a reflex-based controller, which expands its niche to level ground while keeping the control effort and energy expenditure to a minimum (Collins et al. 2005).
However, in the real world, the ground is often not level and frequent corrective action needs to be taken. It turns out that often the very same mechanical system can generate this corrective response. This phenomenon is known as self-stabilization and is the result of a mechanical feedback loop. To use dynamical systems terminology, certain trajectories (such as walking with a particular gait) have attracting properties, and small perturbations are automatically corrected without control; or one could say that 'control' is inherent in the mechanical system.² Blickhan et al. (2007) review self-stabilizing properties of biological muscles in a paper entitled 'Intelligence by Mechanics'; Koditschek et al. (2004) analyze walking insects and derive inspiration for the design of a hexapod robot with unprecedented mobility (RHex; e.g., Saranli et al. 2001).
¹ In this chapter, we will use 'agent' to describe humans, animals, or robots.
## Sensorimotor Intelligence
Mechanical feedback loops constitute the most basic illustration of the contribution of embodiment and embeddedness to behavior. The immediate next level can probably be attributed to direct, reflex-like, sensorimotor loops. Again, robots can serve to study the mechanisms of 'reactive' intelligence. Grey Walter (Walter 1953), the pioneer of this approach, built electronic machines with a minimal 'brain' that displayed phototactic-like behavior. This was picked up by Valentino Braitenberg (Braitenberg 1986), who designed a whole series of two-wheeled vehicles of increasing complexity. Even the most primitive ones, in which sensors are directly connected to motors (exciting or inhibiting them), display sophisticated behaviors. Although the driving mechanisms are simple and entirely deterministic, the interaction with the real world, which brings in noise, gives rise to complex behavioral patterns that are hard to predict.
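The direct sensor-to-motor wiring of the simplest vehicles can be sketched in a few lines. The following is a minimal, noise-free toy simulation and not a reconstruction of any of the robots cited above: the point light source, the intensity fall-off, the sensor placement, and all parameter values (`base`, `gain`, `width`) are assumptions chosen for illustration. It only shows that swapping which sensor excites which wheel is enough to turn a light-avoiding vehicle into a light-approaching one.

```python
import math

LIGHT = (0.0, 0.0)  # position of a single point light source (assumed)

def intensity(x, y):
    """Light intensity at (x, y); falls off with squared distance."""
    d2 = (x - LIGHT[0]) ** 2 + (y - LIGHT[1]) ** 2
    return 1.0 / (1.0 + d2)

def step(state, crossed, dt=0.1, base=0.2, gain=2.0, width=0.2):
    """Advance a two-wheeled vehicle by one time step.

    crossed=True : each sensor excites the wheel on the opposite side
    crossed=False: each sensor excites the wheel on its own side
    """
    x, y, heading = state
    # Two light sensors at the front, angled left and right of the heading.
    s_left = intensity(x + width * math.cos(heading + 0.5),
                       y + width * math.sin(heading + 0.5))
    s_right = intensity(x + width * math.cos(heading - 0.5),
                        y + width * math.sin(heading - 0.5))
    # Direct sensor-to-motor wiring: no memory, no model of the world.
    if crossed:
        v_left, v_right = base + gain * s_right, base + gain * s_left
    else:
        v_left, v_right = base + gain * s_left, base + gain * s_right
    v = 0.5 * (v_left + v_right)          # forward speed
    omega = (v_right - v_left) / width    # turning rate of a differential drive
    return (x + v * math.cos(heading) * dt,
            y + v * math.sin(heading) * dt,
            heading + omega * dt)

def closest_approach(crossed, steps=400):
    """Smallest distance to the light reached during one run."""
    state = (-3.0, 1.0, 0.0)  # start to the left of the light, heading along +x
    best = math.hypot(state[0] - LIGHT[0], state[1] - LIGHT[1])
    for _ in range(steps):
        state = step(state, crossed)
        best = min(best, math.hypot(state[0] - LIGHT[0], state[1] - LIGHT[1]))
    return best

print("crossed wiring:  ", round(closest_approach(True), 2))
print("uncrossed wiring:", round(closest_approach(False), 2))
```

With crossed excitatory connections the vehicle steers toward the light, and with uncrossed connections it steers away, although neither controller contains anything that could be called a goal or a representation.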
This line was picked up by Rodney Brooks, who added an explicit anti-representationalist perspective in response to the by then firmly established cognitivistic paradigm (e.g., Fodor 1975; Pylyshyn 1984) and 'good old-fashioned artificial intelligence' (GOFAI) (Haugeland 1985). Brooks openly attacked the GOFAI position in the seminal articles 'Intelligence without Reason' (Brooks 1991a) and 'Intelligence without Representation' (Brooks 1991b), and proposed behavior-based robotics instead. Through building robots that interact with the real world, such as insect robots (Brooks 1989), he realized that 'when we examine very simple level intelligence we find that explicit representations and models of the world simply get in the way. It turns out to be better to use the world as its own model' (Brooks 1991b). Inspired by biological evolution, Brooks created a decentralized control architecture consisting of different layers; every layer is a more or less simple coupling of sensors to motors. The layers operate in parallel but are built in a hierarchy (hence the term subsumption architecture; Brooks 1986). The individual modules in the architecture may have internal states (the agents are thus no longer purely reactive); however, Brooks argued against calling the internal states representations (Brooks 1991b).
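The layering idea can be illustrated with a deliberately simplified sketch. It is not Brooks's original wiring, which connects augmented finite-state machines through suppression and inhibition nodes; here each layer is assumed to be a plain function that either passes the command from the layers below through or replaces it. The layer names, the sensor key `range`, and the thresholds are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Command = Tuple[float, float]                       # (forward speed, turn rate)
Layer = Callable[[Dict[str, float], Command], Command]

def cruise(sensors: Dict[str, float], below: Command) -> Command:
    """Layer 0: always drive straight ahead."""
    return (0.5, 0.0)

def avoid(sensors: Dict[str, float], below: Command) -> Command:
    """Layer 1: suppress the lower layer's output when an obstacle is near."""
    if sensors["range"] < 0.3:
        return (0.0, 1.0)       # stop and turn on the spot
    return below                # otherwise let the lower layer through

def control(sensors: Dict[str, float],
            layers: Sequence[Layer] = (cruise, avoid)) -> Command:
    """Evaluate the layers bottom-up; each one is a sensor-to-motor coupling."""
    command: Command = (0.0, 0.0)
    for layer in layers:
        command = layer(sensors, command)
    return command

print(control({"range": 1.0}))  # free space: the cruise command passes through
print(control({"range": 0.1}))  # obstacle close: the avoid layer takes over
```

New competences are added as further layers without rewriting the existing ones, and there is no central world model that all layers consult.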
² The description is idealized; in reality, a walking machine would fall into the category of 'hybrid dynamical systems,' where the notions of attractivity and stability are more complicated.
## Minimal Embodied Cognition
In the case studies described in the previous section, the agents were either mere physical machines or they relied on simple direct sensorimotor loops only, resembling reflex arcs of the biological realm. They were reactive agents constrained to the 'here-and-now' time scale, with no capacity for learning from experience and no possibility of predicting the future course of events. Although remarkable behaviors were sometimes demonstrated, there are intrinsic limitations.
The introduction of first instances of internal simulation, which goes beyond the 'here-and-now' time scale, is considered the hallmark of cognition by some (e.g., Clark and Grush 1999). This could be a simple forward model (as present already in insects; see Webb 2004) that provides the prediction of a future sensory state given the current state and a motor command (efference copy). Forward models could provide a possible explanation of the evolutionary origin of first simulation/emulation circuitry³ and of environmentally decoupled thought: the agent employing primitive 'models' before or instead of directly operating on the world.
Early emulating agents would then constitute the most minimal case of what Dennett calls a Popperian creature, a creature capable of some degree of off-line reasoning and hence able (in Karl Popper's memorable phrase) to 'let its hypotheses die in its stead' (Dennett 1995, p. 375). (Clark and Grush 1999, p. 7)
Importantly, we are still far from any abstract models or symbolic reasoning. Instead, we are dealing with the sensorimotor space and the possibility for the agent to extract regularities in it and later exploit this experience in accordance with its goals. For example, the agent can learn that given a certain visual stimulation, say, from a cup, a particular motor action (reach and grasp) will lead to a pattern of sensory stimulation (in humans: we can feel the cup in the hand). The sensorimotor space plays a key part here and it is critically shaped by the embodiment of the agent and its embedding in the environment: a specific motor signal only leads to a distinct result if embedded in the proper physical setup. If you change the shape and muscles of the arm, the motor signal will not result in a successful grasp.
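To make the notion of a forward model concrete, the following minimal sketch learns one from 'motor babbling' and then uses it off-line. Everything in it is an illustrative assumption rather than a model taken from the cited work: the agent has a single scalar sensory state, the hypothetical function `body_and_world` stands in for the physical body and environment, and the forward model is a linear least-squares fit.

```python
import random

def body_and_world(s, m):
    """Hypothetical plant: the actual sensory consequence of motor command m."""
    return 0.8 * s + 0.5 * m

def fit_forward_model(samples):
    """Least-squares fit of s_next ~ a*s + b*m from (s, m, s_next) triples."""
    sss = sum(s * s for s, m, n in samples)
    smm = sum(m * m for s, m, n in samples)
    ssm = sum(s * m for s, m, n in samples)
    ssn = sum(s * n for s, m, n in samples)
    smn = sum(m * n for s, m, n in samples)
    det = sss * smm - ssm * ssm          # determinant of the 2x2 normal equations
    a = (ssn * smm - smn * ssm) / det
    b = (smn * sss - ssn * ssm) / det
    return a, b

def rollout(a, b, s0, motor_plan):
    """Predict the sensory consequences of a motor plan without executing it."""
    s, predicted = s0, []
    for m in motor_plan:
        s = a * s + b * m
        predicted.append(s)
    return predicted

rng = random.Random(0)
samples, s = [], 0.0
for _ in range(200):                     # 'motor babbling' with random commands
    m = rng.uniform(-1.0, 1.0)
    s_next = body_and_world(s, m)
    samples.append((s, m, s_next))
    s = s_next

a, b = fit_forward_model(samples)
imagined = rollout(a, b, 0.0, [1.0, 1.0, 0.0])
print(a, b, imagined)
```

The fitted coefficients are only valid for the body and environment in which they were learned: if `body_and_world` is changed, which corresponds to changing the shape or muscles of the arm, the same motor plan leads to different sensory consequences and the old predictions fail.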
## Quantifying the Effect of Embodiment Using Information Theory
For the cognitive development of an agent, the 'quality' of the sensorimotor space determines what can be learned. First, the type of sensory receptors (their mechanism of transduction) determines what kind of signals the agent's brain or controller will be receiving from the environment. Furthermore, the shape and placement of these sensors will perform an additional transformation of the information that is available in the environment.
³ See Grush (2004) for the similarities and differences between emulation theory (Grush 2004) and simulation theory (Jeannerod 2001).
For example, different species of insects have evolved different non-homogeneous arrangements of the light-sensitive cells in their eyes, providing an advantageous nonlinear transformation of the input for a particular task. One example is exploiting egomotion together with motion parallax to gauge distance to objects in the environment and eventually facilitate obstacle avoidance. Using a robot modeled after the facet eye of a housefly, Franceschini et al. (1992) showed that the non-uniform arrangement of the facets (denser in the front than on the side) compensates for the motion parallax and allows uniform motion detection circuitry to be used in the entire eye, which makes it easy for the robot to avoid obstacles with little computation. These findings were confirmed in experiments with artificial evolution on real robots (Lichtensteiger 2004). Artificial eyes with designs inspired by arthropods include those of Song et al. (2013) and Floreano et al. (2013).
It is not always possible to pinpoint the specific transformation of sensory signals that is facilitated by the morphology as in the previous case. A more general tool is provided by the methods of information theory. Information is used here in the Shannon sense, to quantify statistical patterns in observed variables. The structure or amount of information induced by a particular sensor morphology can be captured by different measures, for example, entropy. However, information (structure) in the sensory variables tells only half of the story (a 'passive perception' one in this case), because organisms interact with their environments in a closed-loop fashion: sensory inputs are transformed into motor outputs, which in turn determine what is sensed next. Therefore, the 'raw material' for cognition is constituted by the sensorimotor variables, and it is thus crucial to study relationships between sensors and motors, as illustrated by the sensorimotor contingencies (see next section). Furthermore, time is no less important a variable. Lungarella and Sporns (2006) provide an excellent example of the use of information-theoretic measures in this context. In a series of experiments with a movable camera system, they showed that, for example, the entropy in the visual field is lower if the camera is tracking a moving visual target (a red ball) than in the condition where the movements of the ball and the camera are uncorrelated. This is intuitively plausible, because if the object is kept in the center of the visual field, there is more 'order,' i.e., less entropy. A collection of case studies on information-theoretic implications of embodiment in locomotion, grasping, and visual perception is presented by Hoffmann and Pfeifer (2011).
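The intuition behind the tracking result can be reproduced with a toy calculation. The sketch below is not the experimental setup or the exact measure of Lungarella and Sporns (2006); it assumes a one-dimensional ring of 16 positions, a ball performing a random walk, and a 'camera' that either follows the ball (with occasional lag) or moves independently. The Shannon entropy H = -Σ p log2 p is then estimated for the position of the ball in the visual field.

```python
import math
import random
from collections import Counter

def entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def retinal_positions(tracking, steps=5000, size=16, seed=0):
    """Position of the ball relative to the camera on a ring of `size` cells."""
    rng = random.Random(seed)
    ball, cam, out = 0, 0, []
    for _ in range(steps):
        ball = (ball + rng.choice((-1, 0, 1))) % size
        if tracking:
            # The camera follows the ball but lags behind now and then.
            cam = ball if rng.random() < 0.9 else cam
        else:
            # Camera movement is uncorrelated with the ball.
            cam = (cam + rng.choice((-1, 0, 1))) % size
        out.append((ball - cam) % size)
    return out

h_tracking = entropy(retinal_positions(tracking=True))
h_uncorrelated = entropy(retinal_positions(tracking=False))
print("tracking:", round(h_tracking, 2), "bits")
print("uncorrelated:", round(h_uncorrelated, 2), "bits")
```

When the camera tracks the ball, the ball stays near the center of the visual field and the entropy is low; with uncorrelated movement the ball wanders across the whole field and the entropy approaches its maximum of log2(16) = 4 bits. The sensory statistics are thus shaped by the agent's own actions.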
## Sensorimotor Contingencies
Sensorimotor contingencies (SMCs) were originally presented in the influential article by O'Regan and Noë (2001) as the structure of the rules governing sensory changes produced by various motor actions. The SMCs, according to O'Regan and Noë, are the key 'raw material' upon which perception, cognition, and eventually consciousness operate. Furthermore, they sketch a possible hierarchy ranging from modality-related (or apparatus-related) SMCs to object-related SMCs. The former, the modality-related SMCs, would capture the immediate effect that certain actions (or movements) have on sensory stimulation. Clearly, these would be specific to the sensory modality (e.g., head movement will induce a different change in the SMCs of the visual and auditory modalities: turning the head will change the visual stimulation almost entirely, whereas changes in the acoustic system will be minimal) and would strongly depend on the sensory morphology. Therefore, this concept is strongly related to what we have discussed in the previous sections: (1) different sensory morphology importantly affects the information flow induced in the sensory receptors and hence also the corresponding SMCs; (2) the effect of action is already constitutively included in the SMC notion itself.
Although conceptually very powerful, the notion of SMCs was not articulated concretely enough in O'Regan and Noë (2001) to be expressed mathematically or directly transferred into a robot implementation, for example. Bührmann et al. (2013) have proposed a formal dynamical systems account of SMCs. They devised a dynamical system description for the environment and the agent, which is in turn split into body, internal state (such as neural activity), motor, and sensory dynamics. Bührmann et al. distinguish between sensorimotor (SM) environment, SM habitat, SM coordination, and SM strategy. The SM environment is the relation between motor actions and changes in sensory states, independent of the agent's internal (neural) dynamics. The other notions, from SM habitat to SM strategies, add internal dynamics to the picture. SM habitat refers to trajectories in the sensorimotor space, but subject to constraints given by the internal dynamics that are responsible for generating motor commands, which may depend on previous sensory states as well (an example of closed-loop control). SM coordination then further reduces the set of possible SM trajectories to those 'that contribute functionally to a task.' For example, specific patterns of squeezing an object in order to assess its hardness would be SM coordination patterns serving object discrimination. Finally, SM strategies additionally take 'reward' or 'value' for the agent into account.
As wonderfully illustrated by Beer and Williams (2015), dynamical systems theory and information theory are two complementary mathematical lenses through which brain-body-environment systems can be studied. While acknowledging the merits of both frameworks as 'intuition, theory, and experimental pumps' (Beer and Williams 2015), it is probably fair to say that, compared to dynamical systems, information theory has thus far been more successfully applied to the analysis of real systems of higher dimensionality. This is true for both natural systems, in particular brains (Garofalo et al. 2009; Quiroga and Panzeri 2009), and artificial systems. Thus, to study sensorimotor contingencies in a real robot beyond the simple simulated agents of Bührmann et al. (2013) and Beer and Williams (2015), we chose to use the lens of information theory. Following up on related studies by, e.g., Olsson et al. (2004), we conducted a series of studies in a real quadrupedal robot with rich nonlinear dynamics and a collection of sensors from different modalities (Hoffmann et al. 2012; Hoffmann et al. 2014; Schmidt et al. 2013) (see Box 45.1). We have applied the notion of 'transfer entropy'
## Box 45.1 Sensorimotor contingencies in a quadruped robot
Figure 45.1. Robot 'Puppy' and sensorimotor contingencies.
[Figure 45.1, panels: (a) photograph of the Puppy robot with labeled components: 4 motors, 4 hip encoders, 4 knee encoders, and 4 pressure sensors. (b)-(f) Information-flow diagrams overlaid on a schematic of the robot, showing flows between motor and sensor channels; arrow thickness and gray level encode the flow in bits, on a scale from 0 to 0.45 bits in (b) and from 0.35 to 0.51 bits in (c)-(f). (g) Horizontal bar chart of classification accuracy (%, axis from 40 to 100) with 'mean' and 'best' bars for four conditions: sensory data only; sensory data + action; sensory data + action from 2 epochs; sensory data + action from 3 epochs.]
Experiments were conducted on the quadrupedal robot Puppy (Figure 45.1a), which has four servomotors in the hips together with encoders measuring the angle at the joint, four encoders in the passive compliant knees, and four pressure sensors on the feet. We used the notion of 'transfer entropy' from information theory, which can be used to measure directed information flows between time series. In our case, the time series were collected from individual motor and sensory channels, and the information transfer was calculated twice for every pair of channels, once in each direction (say, from the hind right motor to the front right knee encoder and also in the opposite direction). Loosely speaking, transfer entropy from channel A to channel B measures how much better the future state of channel B can be predicted when the current state of channel A is known in addition to the past of channel B itself (see Schmidt et al. 2013 for details).
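As an illustration of the measure, the following sketch computes transfer entropy for discrete time series with a history length of one, using the standard definition TE(X→Y) = Σ p(y', y, x) log2 [p(y' | y, x) / p(y' | y)] and a simple plug-in (counting) estimate. It is not the estimator or the preprocessing used in the robot experiments (see Schmidt et al. 2013 for those); the binary `motor` and `sensor` series are synthetic stand-ins in which the sensor echoes the previous motor command with 10% noise.

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate of transfer entropy X -> Y in bits (history length 1)."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y_next, y_now, x_now)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))         # (y_now, x_now)
    pairs_yy = Counter(zip(y[1:], y[:-1]))          # (y_next, y_now)
    singles = Counter(y[:-1])                       # y_now
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_full = c / pairs_yx[(y0, x0)]             # p(y_next | y_now, x_now)
        p_self = pairs_yy[(y1, y0)] / singles[y0]   # p(y_next | y_now)
        te += (c / n) * math.log2(p_full / p_self)
    return te

rng = random.Random(1)
motor = [rng.randint(0, 1) for _ in range(5000)]
# The sensor at time t+1 copies the motor command at time t, flipped 10% of the time.
sensor = [0] + [m if rng.random() > 0.1 else 1 - m for m in motor[:-1]]

te_forward = transfer_entropy(motor, sensor)
te_backward = transfer_entropy(sensor, motor)
print("motor -> sensor:", round(te_forward, 3), "bits")
print("sensor -> motor:", round(te_backward, 3), "bits")
```

The asymmetry between the two directions is what makes the measure suitable for mapping which channels drive which: here the motor series is informative about the future of the sensor series, but not the other way around.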
First, we wanted to investigate the 'sensorimotor structure,' i.e., the relative strengths of relationships between different sensors and motors, which is intrinsic to the robot's embodiment (body + sensor morphology only). To this end, random motor commands were applied and the relationships between motor and sensory variables were studied, closely resembling the notion of SM environment (Bührmann et al. 2013). The strongest information flows between pairs of channels were extracted and are shown overlaid on the schematic of the Puppy robot (dashed lines) in panel B. The transfer entropy is encoded as thickness and gray level of the arrows. The strongest flow occurs from the motor signals to their respective hip joint angles, which is expected because the motors directly drive the respective hip joints. The motors have a smaller influence on the knee angles (stronger in the hind legs) and on the foot pressure sensors of the legs on which the respective motor is mounted, illustrating that body topology was successfully extracted (at the same time, the flows from the hind leg motors and hips to the front knees highlight that the functional relationships differ from the static body structure; see also Schatz and Oudeyer 2009). These patterns are analogous to the modality-related SMCs; just as we can predict the sensory changes induced by moving the head, the robot can predict the effects of moving the hind leg, say.
In a second step, we studied the relationships in the sensorimotor space when the robot was running with specific coordinated periodic movement patterns or gaits. The results for two selected gaits, turn left and bound right,* are shown in panels C and D, respectively. The flows from motors to the hip joints, which would again dominate, were left out of the visualization. The plots clearly demonstrate the important effect of specific action patterns in two ways. First, they markedly differ from the random motor command situation: the dominant flows are different and, in addition, the magnitude of the information flows (the number of bits) is larger (note the different range of the color bar compared to B), illustrating how much information structure is induced by the 'neural pattern generator.' Second, they also significantly differ between themselves. The 'turn left' gait in panel C reveals the dominant action of the right leg and in particular the knee joint. In the 'bound right' gait in D, the motor signals are predictive of the sensory stimulation in the hind knees and also the left foot. The gaits were obtained by optimizing the robot's performance for speed or for turning and thus correspond to patterns that are functionally relevant for the robot and can even be said to carry 'value.' Thus, in the perspective of Bührmann et al. (2013), our findings about the sensorimotor space using the gaits can be interpreted as studying the SM coordination or even SM strategy of the quadruped robot.
Finally, next to the embodiment or morphology (shape of the body and limbs, type and placement of sensors and effectors, etc.) and the brain (the neural dynamics responsible for generating the coordinated motor command sequences), the SMCs are co-determined by the environment as well. All the results thus far came from sensorimotor data collected from the robot running on a plastic foil ground (low friction). Panels E and F depict how the information flows for the bound right gait are modulated when the robot runs on a different ground (E: Styrofoam, F: rubber). The overall pattern is similar to D, but the flows to the left foot disappear, and eventually flows to the left knee joint become dominant. This is because the posture of the robot changed: the left foot now contacts the ground at a different angle, inducing less stimulation in the pressure sensor. Also, as the friction increases (from the foil over Styrofoam to rubber), the push-off during stance of the left hind leg becomes stronger, resulting in more pronounced bending of the knee. Finally, since the high-friction ground poses more resistance to the robot's movements, the trajectories are less smooth and the overall information flow drops.
While all the components (body, brain, environment) have a profound effect on the overall sensorimotor space, our analysis reveals that in this case, the gait used (as prescribed primarily by the 'neural/brain' dynamics) is a more important factor than the environment (the ground); the latter seems to modulate the basic structure of information flows induced by the gait. This has important consequences for the agent when it is to learn something about its environment and perform perceptual categorization, for example. In order to investigate this quantitatively, we presented the robot with a terrain classification task (the terrain being the surface or ground it was running on). Relying on sensory information alone leads to significantly worse terrain classification results than when the gait is explicitly taken into account in the classification process (Hoffmann, Stepanova, and Reinstein 2014). Furthermore, in line with the predictions of the sensorimotor contingency theory, longer sensorimotor sequences are necessary for object perception (Maye and Engel 2012). That is, while in short sequences (motor command, sensory consequence) modality-related SMCs (panel B) will be dominant, longer interactions will allow the objects the agent is interacting with to stand out. Using data from our robot, this is demonstrated in panel G. The first row shows classification results when using data from one sensory epoch (two seconds of locomotion) collapsed across all gaits, i.e., without the action context. Subsequent rows report results where classification was performed separately for each gait and increasingly longer interaction histories were available. 'Mean' values represent the mean performance; 'best' values are classification results from the gait that facilitated perception the most (see Hoffmann et al. 2012 for details).
* 'Turn left' was a movement pattern dominated by the action of the right hind leg, which was pushing the robot forward and left. Regarding 'bound right,' the bounding gait is a running gait used by small mammals. It is similar to a gallop and features a flight phase, but is characterized by synchronous action of every pair of legs. However, in this study, we used lower speeds without an aerial phase. In addition, the symmetry of the motor signals was slightly disrupted, resulting in a right-turning motion.
from information theory, which can be used to characterize sensorimotor flows in the robot (for example, how strongly sensors are affected by motor commands), and we tried to isolate the effects of the body, motor programs (gaits), and environment in the agent's sensorimotor space. Finally, we tested the predictions of SMC theory regarding object discrimination. In our investigations, we have chosen the situated perspective, analyzing only the relationships between sensory and motor variables that would also be available to the agent itself. However, information-theoretic methods can also be productively applied to study relationships between internal and external variables, such as between sensory or neuronal states and some properties of an external object (e.g., its size, Beer and Williams 2015; or any other property that can be expressed numerically). Using this approach, one can obtain important insights into the operation and temporal evolution of categorization, for example. Performing this in the ground discrimination scenario on the quadrupedal robot constitutes our future work.
While the studies on 'minimally cognitive agents' are of fundamental importance and lead to valuable insights for our understanding of intelligent behavior, the ultimate target is, of course, human cognition. Toward this end, one may want to resort to more sophisticated tools, for example, humanoid robots.
## Human-like Cognition in Robots
In the previous section, we showed how robots can be beneficial in operationalizing, formalizing, and quantifying ideas, concepts, and theories that are important for understanding cognition but that are often not articulated in sufficient detail. An obvious implication of this analysis is that the kind of cognition that emerges will be highly dependent on the body of the agent, its sensorimotor apparatus, and the environment it is interacting with. Thus, to target human cognition, the robot's morphology (shape, type of sensors and their distribution, materials, actuators) should resemble that of humans as closely as possible. Now we have to be realistic: approximating humans very closely would imply mimicking their physiology, the sensors in the body, the inner organs, the muscles with comparable biological instantiation, and the bloodstream that supplies the body with energy and oxygen. Only then could the robot experience the true concept of, e.g., being thirsty or out of breath, hearing the heart pumping, blushing, or the feeling of quenching the thirst while drinking a cold beer in the summer. So, even if, on the surface, a robot might be almost indistinguishable from a human (like, for example, Hiroshi Ishiguro's recent humanoid 'Erica'), we have to be aware of the fundamental differences: comparatively very few muscles and tendons, no actuators that can get sore when overused, no sensors for pain, only low-density haptic sensors, no sweat glands in the skin, and so on. Thus, 'Erica' will have a very impoverished concept of drinking or feeling hot. In other words, we have to make substantial abstractions.
Just as an aside, making abstractions is nothing bad; in fact, it is one of the most crucial ingredients of any scientific explanation because it forces us to focus on the essentials, ignoring whatever is considered irrelevant (the latter most likely being the majority of things that we could potentially take into account). Thus, the specifics of the robot's cognition (its concepts, its body schema) will clearly diverge from those of humans, but the underlying principles will, at a certain level of abstraction, be the same. For example, it will have its own sensorimotor contingencies, it will form cross-modal associations through Hebbian learning, and it will explore its environment using its sensorimotor setup. So if the robot says 'glass,' this will relate to very different specific sensorimotor experiences, but if the robot can recognize, fill, and hand a 'glass' to a human for drinking, it makes sense to say that the robot has acquired the concept of 'glass.'
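To make the idea of cross-modal association through Hebbian learning more concrete, the following minimal sketch may help. It is a toy illustration and not a model used on any of the robots discussed in this chapter: the two "modalities," the one-hot activity patterns, the learning rate, and the function names are all assumptions chosen for readability. Weights between units that are repeatedly active at the same time grow, so that activity in one modality later evokes the associated pattern in the other.

```python
def hebbian_update(weights, pre, post, learning_rate=0.1):
    """Plain Hebbian rule: strengthen weights[i][j] when post[i] and pre[j] are co-active."""
    for i, post_i in enumerate(post):
        for j, pre_j in enumerate(pre):
            weights[i][j] += learning_rate * post_i * pre_j
    return weights


def recall(weights, pre):
    """Activity evoked in the 'post' modality by a pattern in the 'pre' modality."""
    return [sum(w * p for w, p in zip(row, pre)) for row in weights]


def train_associations(experiences, n_pre, n_post, repetitions=10):
    """Present each (pre, post) pair repeatedly and return the learned weight matrix."""
    weights = [[0.0] * n_pre for _ in range(n_post)]
    for _ in range(repetitions):
        for pre, post in experiences:
            hebbian_update(weights, pre, post)
    return weights


# Hypothetical experiences: a visual pattern co-occurring with a tactile pattern.
SEEN_GLASS, SEEN_BALL = [1.0, 0.0], [0.0, 1.0]
FELT_SMOOTH_HARD, FELT_SOFT = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]

learned = train_associations(
    [(SEEN_GLASS, FELT_SMOOTH_HARD), (SEEN_BALL, FELT_SOFT)], n_pre=2, n_post=3
)
evoked = recall(learned, SEEN_GLASS)  # strongest activity at the 'smooth, hard' unit
```

The plain rule used here grows without bound; practical models usually add normalization or decay, which is omitted to keep the sketch short.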
Because the acquisition of concepts is based on sensorimotor contingencies, which in turn require actions on the part of the agent, and because the patterns of sensory stimulation are associated with the respective motor signals, the robot platforms of choice will ideally be tendon-driven, just like humans, who use muscles and tendons for movements. Given our discussion on abstraction earlier, we can also study concept acquisition in robots that have motors in the joints; we just have to be aware of the concrete differences. Still, the principles governing the robot's cognition can be very similar to those of humans (see Box 45.2 for examples of different types of humanoid robots).
## BOX 45.2 Humanoid embodiment for modeling cognition
<details>
<summary>Image 2 Details</summary>

### Visual Description
## Photograph: Robot Interaction
### Overview
The image is a photograph showing two women interacting with a humanoid robot. The robot is positioned in the center, with a woman on either side. The women appear to be playfully interacting with the robot, possibly attempting to "kiss" it. The background features an abstract painting. The image is labeled with "(a)" in the top-left corner.
### Components/Axes
* **People:** Two women, one on each side of the robot.
* **Robot:** A humanoid robot with a white head and visible internal mechanisms.
* **Background:** An abstract painting with white, red, and gray colors.
* **Label:** "(a)" in the top-left corner.
### Detailed Analysis
* **Women:** The woman on the left has short dark hair and is wearing a black jacket. The woman on the right has long dark hair and is also wearing a black jacket. Both women are leaning towards the robot's head with their lips pursed.
* **Robot:** The robot has a white, egg-shaped head with several circular indentations. Its body is partially transparent, revealing internal wiring and mechanical components. The robot's arms are extended.
* **Background:** The painting in the background is abstract and features broad strokes of white, red, and gray paint.
* **Label:** The label "(a)" is located in the top-left corner of the image, likely indicating that this is part of a series of images or figures.
### Key Observations
* The image captures a playful interaction between humans and a robot.
* The robot's design is somewhat futuristic and exposes its internal components.
* The abstract painting in the background adds a layer of artistic context.
### Interpretation
The image likely explores the theme of human-robot interaction and the evolving relationship between humans and technology. The playful interaction suggests a sense of curiosity and perhaps even affection towards the robot. The robot's exposed internal components could symbolize transparency or the demystification of technology. The abstract background adds an artistic element, suggesting that the image is intended to be more than just a straightforward depiction of human-robot interaction. It could be interpreted as a commentary on the integration of technology into human life and the blurring of boundaries between the natural and the artificial.
</details>
<details>
<summary>Image 3 Details</summary>

### Visual Description
## Photograph: Humanoid Robot Stretching
### Overview
The image shows a humanoid robot performing a stretching exercise. The robot is silver in color with visible internal wiring and components. It is wearing white sneakers and is positioned against a green background. The robot's posture suggests it is bending to the side with one arm raised overhead and the other extended to the side.
### Components/Axes
* **Robot:** Silver humanoid robot with visible internal components.
* **Clothing:** White sneakers.
* **Background:** Solid green color.
* **Annotation:** The letter "(c)" is visible in the top-left corner of the image.
### Detailed Analysis
The robot is depicted in a dynamic pose, suggesting movement or flexibility. Its internal structure is exposed, revealing the complexity of its design. The white sneakers provide a sense of scale and grounding. The green background is uniform and does not provide additional context.
### Key Observations
* The robot is designed to mimic human-like movements.
* The exposed internal components highlight the engineering aspect of the robot.
* The stretching pose suggests a focus on flexibility and range of motion.
### Interpretation
The image likely showcases the capabilities of the humanoid robot in terms of movement and flexibility. The exposed internal components may be intended to demonstrate the complexity of the robot's design. The stretching pose could indicate that the robot is being used for research or development in areas such as robotics, biomechanics, or human-robot interaction. The annotation "(c)" is the panel label within Figure 45.2.
</details>
<details>
<summary>Image 4 Details</summary>

### Visual Description
## Photograph: Robot Torso
### Overview
The image is a close-up photograph of the internal mechanisms of a robot torso. The robot's "skeleton" is visible, along with various electronic components, wires, and supporting structures. Several logos and labels are visible on the robot's body.
### Components/Axes
* **Robot Torso:** The main subject of the image, showing the internal structure and components.
* **Electronic Components:** Various circuit boards, wires, and other electronic parts are visible.
* **Logos/Labels:** Several logos and labels are present on the robot's body, including "ai lab", "Vormind", "robotics", "prototype", and others.
* **(b):** Label in the top-left corner.
### Detailed Analysis
The photograph shows a complex arrangement of components within the robot's torso. The "skeleton" appears to be made of a white material, possibly plastic or a composite. Wires and cables are routed throughout the structure, connecting the various electronic components. The logos and labels suggest the organizations or companies involved in the robot's development or construction.
The logos visible include:
* "ai lab" (top center)
* "Vormind" (top center)
* "robotics" (top right)
* "prototype" (bottom right)
* Other logos are present but not clearly legible.
### Key Observations
* The robot's internal structure is highly complex and densely packed with components.
* The presence of multiple logos suggests a collaborative effort in the robot's development.
* The overall impression is one of advanced technology and intricate engineering.
### Interpretation
The photograph provides a glimpse into the inner workings of a sophisticated robot. The complexity of the internal structure highlights the challenges involved in designing and building such machines. The presence of multiple logos suggests that the robot is the result of a collaborative effort between different organizations or companies. The image conveys a sense of technological advancement and the potential for robots to perform complex tasks.
</details>
<details>
<summary>Image 5 Details</summary>

### Visual Description
## Diagram: Baby with Spots
### Overview
The image shows a 3D rendered model of a baby covered in red spots. The baby is in a seated position with its hands near its face. The background is a solid light yellow color. The image is labeled with "(d)" in the top-left corner.
### Components/Axes
* **Baby Model:** A 3D rendering of a baby in a seated position.
* **Red Spots:** Numerous small red spots are distributed across the baby's entire body surface.
* **Background:** A solid light yellow color.
* **Label:** "(d)" in the top-left corner.
### Detailed Analysis
The baby model is rendered with a smooth, pale skin tone. The red spots are evenly distributed across the body, including the face, torso, limbs, hands, and feet. The spots appear to be slightly raised or textured. The baby's hands are positioned near its face, with the fingers slightly curled. The baby is seated with its legs spread apart.
### Key Observations
* According to the caption of Figure 45.2d, the red spots mark the positions of tactile sensors on the body surface of the fetus model.
* The 3D rendering provides a clear visual representation of the extent and distribution of the spots.
### Interpretation
The image shows the fetus simulator of Figure 45.2d: a 3D musculoskeletal model whose red spots indicate the distribution of tactile sensors over the body surface. The label "(d)" identifies the panel within Figure 45.2.
</details>
<details>
<summary>Image 6 Details</summary>

### Visual Description
## Photograph: iCub Robot
### Overview
The image shows a close-up of an iCub robot, a humanoid robot designed for research in embodied artificial intelligence. The robot is positioned in a way that suggests it is interacting with something, with its hands slightly outstretched. The robot has a human-like face with red LED lights illuminating its eyebrows and mouth.
### Components/Axes
* **Robot Body:** The robot has a black torso and silver-colored limbs.
* **Head:** The head is cream-colored with black headphones. The face has red LED lights for eyebrows and a mouth.
* **Hands:** The robot has articulated fingers.
* **Background:** The background is a neutral gray color.
* **Label:** The top-left corner of the image contains the label "(e)".
### Detailed Analysis
The iCub robot is shown from the chest up. The robot's head is slightly tilted to the left. The robot's arms are bent at the elbows, with the hands positioned in front of the torso. The fingers are slightly spread apart. The red LED lights on the face give the robot an expressive appearance.
### Key Observations
* The robot's design is intended to mimic a human child.
* The red LED lights on the face are a key feature of the robot's design, allowing it to express emotions.
* The robot's articulated fingers allow it to manipulate objects.
### Interpretation
The image showcases the iCub robot, a sophisticated platform for research in embodied AI. The robot's human-like design and expressive features make it a valuable tool for studying human-robot interaction. The robot's ability to manipulate objects allows it to perform tasks in the real world. The image highlights the potential of robots to interact with humans in a natural and intuitive way.
</details>
Figure 45.2. Humanoid robots.
<details>
<summary>Image 7 Details</summary>

### Visual Description
## Photograph: Humanoid Robot
### Overview
The image is a photograph of a humanoid robot designed to resemble a young woman. The robot is positioned against a dark background and is wearing clothing. The image is labeled with the letter "(f)" in the top-left corner.
### Components/Axes
* **Subject:** Humanoid robot
* **Background:** Dark, plain background
* **Clothing:** The robot is wearing a dark-colored shirt or blouse, and a purple shawl or scarf.
* **Label:** "(f)" in the top-left corner.
### Detailed Analysis
The robot has a realistic human-like face with brown hair styled with bangs and side ponytails. The robot is wearing a dark button-down shirt and a purple shawl draped around its shoulders. The lighting is soft and even, highlighting the robot's features.
### Key Observations
* The robot's design aims for a high degree of realism.
* The clothing and styling contribute to the human-like appearance.
* The label "(f)" suggests this image is part of a series or figure in a larger document.
### Interpretation
The photograph showcases the advanced design and construction of a humanoid robot. The attention to detail in the robot's appearance, including its facial features, hair, and clothing, suggests an effort to create a realistic and relatable machine. The image likely serves to demonstrate the capabilities of robotics technology in creating human-like machines. The label "(f)" indicates that this image is part of a larger study or presentation, possibly comparing different robot designs or features.
</details>
<details>
<summary>Image 8 Details</summary>

### Visual Description
## Diagram: Robot Arm Reach
### Overview
The image shows a diagram of a robot arm, likely illustrating its reach and potential areas of focus. The diagram includes a stylized representation of the robot arm, a circular reach indicator, and areas of interest highlighted with blue grid patterns and a red spot.
### Components/Axes
* **Robot Arm:** A simplified, wireframe-style depiction of a robot arm.
* **Reach Indicator:** A red circle indicating the maximum reach of the arm.
* **Areas of Interest:** Two blue grid patterns on either side of the arm's "head" and a red spot on the "head" itself.
### Detailed Analysis
* The robot arm is positioned vertically in the center of the image.
* The red circle is centered around the "head" of the robot arm, indicating the area it can reach.
* The blue grid patterns are located on either side of the "head," possibly indicating areas of visual focus or sensor range.
* The red spot is located on the "head," possibly indicating a primary area of interest or a sensor location.
### Key Observations
* The diagram is a simplified representation, focusing on reach and areas of interest rather than precise mechanical details.
* The use of color (red and blue) helps to highlight key areas.
### Interpretation
The diagram most likely belongs to panel (e) of Figure 45.2, whose caption describes activations in the tactile arrays of the left forearm and right index finger during self-touch. Under that reading, the blue grids would be tactile arrays and the red spot would mark an activated region, rather than a workspace or reach indicator.
</details>
<details>
<summary>Image 9 Details</summary>

### Visual Description
## Diagram: Hand with Sensory Points
### Overview
The image depicts a stylized hand with overlaid sensory points, represented by small blue circles. A red sphere is positioned near the index finger. The hand is rendered in grayscale, while the sphere is red. The background is a light beige color.
### Components/Axes
* **Hand:** A grayscale representation of a human hand.
* **Sensory Points:** Small blue circles distributed across the hand's surface, concentrated on the fingertips and palm.
* **Red Sphere:** A red sphere positioned near the index finger.
* **Background:** Light beige color.
### Detailed Analysis
The sensory points are densely clustered on the fingertips and palm, suggesting areas of high sensitivity. The red sphere appears to be interacting with the index finger. The hand is in a relaxed, open position.
### Key Observations
* The distribution of sensory points highlights areas of tactile sensitivity.
* The red sphere's placement suggests interaction or contact with the hand.
### Interpretation
The image likely represents a concept related to tactile sensing or interaction with objects. The sensory points emphasize the hand's ability to perceive touch, while the red sphere symbolizes an external object being manipulated or sensed. The image could be used to illustrate concepts in robotics, haptics, or sensory perception.
</details>
<details>
<summary>Image 10 Details</summary>

### Visual Description
## Photograph: Pepper Robot
### Overview
The image is a photograph of a white Pepper robot standing indoors. The robot is holding a tablet displaying a graphical user interface. The background is a plain wall. The image is labeled with "(g)" in the top-left corner.
### Components/Axes
* **Robot:** A white Pepper robot with a humanoid form.
* **Tablet:** A tablet held by the robot, displaying a user interface with circular icons.
* **Background:** A plain, light-colored wall.
* **Label:** "(g)" in the top-left corner.
### Detailed Analysis
* **Robot Details:** The robot has a white body with visible joints and a head with large, expressive eyes. One eye appears to be illuminated with a blue light. The robot's right hand is raised slightly.
* **Tablet Interface:** The tablet displays a user interface with six circular icons arranged in two rows of three. The colors of the icons appear to be green, blue, and possibly other colors.
* **Background Details:** The wall is a light, neutral color, providing a clean backdrop for the robot.
* **Label Position:** The label "(g)" is located in the top-left corner of the image.
### Key Observations
* The robot is the central focus of the image.
* The tablet interface suggests the robot is interactive.
* The lighting is even, providing good visibility of the robot and its features.
### Interpretation
The photograph likely illustrates the Pepper robot in an interactive scenario, possibly demonstrating its capabilities in human-robot interaction. The tablet interface suggests the robot can be controlled or programmed to perform specific tasks. The image could be part of a series showcasing the robot's features and applications. The label "(g)" likely refers to a figure number within a larger document or publication.
</details>
A large number of humanoid robots have been developed over the last decades, and many of them can, one way or another, be used to study human cognition. All of them to date are very different from real humans: each of them, implicitly or explicitly, embodies certain types of abstractions. There is therefore no universal platform; rather, each robot has been developed with specific goals in mind. Here we present a few examples and discuss the ways in which they are employed in trying to ferret out the principles of human cognition. The categories shown in Figure 45.2 are musculoskeletal robots (Roboy and Kenshiro), 'baby' robots with sensorized skins (iCub and fetus simulators), and social interaction robots (Erica and Pepper).
In order to use the robots for learning their own complex dynamics and for building up a body schema, both Roboy and Kenshiro (Nakanishi et al. 2012) need to be equipped with many sensors so that they can 'experience' the effect of a particular actuation pattern. Given rich sensory feedback, and using the principle that every action leads to sensory stimulation, both these robots can, in principle, employ motor babbling in order to learn how to move. Especially for Kenshiro, with its very large number of muscles, learning is a must. A very important step in this direction is the work of Richter et al. (2016), who have combined a musculoskeletal robotics toolkit (Myorobotics) with a scalable neuromorphic computing platform (SpiNNaker) and demonstrated control of a musculoskeletal joint with a simulated cerebellum.
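The principle behind motor babbling can be sketched in a few lines. The example below is a deliberately simplified illustration and not the procedure used on Roboy or Kenshiro: the single simulated joint, its linear response, the noise level, and all function names are assumptions. The agent issues random motor commands, records the sensory consequence of each, fits a forward model to these pairs, and then inverts that model to choose a command for a desired posture.

```python
import random


def simulated_joint(command):
    """Hypothetical plant, unknown to the learner: motor command -> joint angle."""
    return 0.8 * command + 0.1


def motor_babbling(plant, n_trials=200, noise_std=0.01, seed=0):
    """Issue random commands and record (command, sensed angle) pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        command = rng.uniform(-1.0, 1.0)
        sensed = plant(command) + rng.gauss(0.0, noise_std)  # noisy proprioception
        samples.append((command, sensed))
    return samples


def fit_forward_model(samples):
    """Least-squares fit of sensed = gain * command + offset."""
    n = len(samples)
    mean_c = sum(c for c, _ in samples) / n
    mean_s = sum(s for _, s in samples) / n
    cov = sum((c - mean_c) * (s - mean_s) for c, s in samples)
    var = sum((c - mean_c) ** 2 for c, _ in samples)
    gain = cov / var
    offset = mean_s - gain * mean_c
    return gain, offset


def command_for(target_angle, gain, offset):
    """Invert the learned forward model to reach a desired joint angle."""
    return (target_angle - offset) / gain


gain, offset = fit_forward_model(motor_babbling(simulated_joint))
command = command_for(0.5, gain, offset)  # command expected to bring the joint near 0.5
```

With dozens of mutually coupled, nonlinear muscles, a linear fit of this kind is no longer adequate, which is one reason why richer learning machinery is needed on musculoskeletal platforms.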
Finally, if the interest is social interaction, it might be more productive to use robots like Erica or Pepper. Both Erica and Pepper are somewhat limited in their sensorimotor abilities (especially haptics), but are endowed with speech understanding and generation facilities; they can recognize faces and emotions; and they can realistically display any kind of facial expression.
## Musculoskeletal robots: Roboy and Kenshiro
Figure 45.2a. Roboy overview: The musculoskeletal design can be clearly observed. At this point, Roboy has 48 'muscles.' Eight are dedicated to each of the shoulder joints. A system of this complexity can no longer be sensibly programmed by hand: learning is a necessity. Currently, Roboy serves as a research platform for the EU/FET Human Brain Project to study, among other things, the effect of brain lesions on the musculoskeletal system. Because it has the ability to express a vast spectrum of emotions, it can also be employed to investigate human-robot interaction, and as an entertainment platform.
Credit: © Embassy of Switzerland in the United States of America.
Figure 45.2b. Close-up of the muscle-tendon system. Although the shoulder joint is distinctly dissimilar to a human one (for example, it doesn't have a shoulder blade), it is controlled by eight muscles, which require substantial skills in order to move properly: which muscles have to be actuated to what extent in order to achieve a desired movement?
Credit: © Erik Tham/ Corbis Documentary/ Getty Images.
Figure 45.2c. Kenshiro's musculoskeletal setup. The musculoskeletal design is clearly visible. At this point, Kenshiro has 160 'muscles': 50 in the legs, 76 in the trunk, 12 in the shoulder, and 22 in the neck. In terms of its musculoskeletal system, it is the one robot that most closely resembles the human. So, if learning of the dynamics in this system is the goal, Kenshiro will be the robot of choice. Note that although Kenshiro is 'closest' to a human in this respect, it is still subject to enormous abstractions. Currently, Kenshiro serves as a research platform at the University of Tokyo to investigate tendon-controlled systems with very many degrees of freedom (Nakanishi et al. 2012).
Credit: Photo courtesy Yuki Asano.
## 'Baby' robots with sensitive skins
Figure 45.2d. Fetus simulator. A musculoskeletal model of a human fetus at 32 weeks of gestation has been constructed and coupled with a brain model comprising 2.6 million spiking neurons (Yamada et al. 2016). The figure shows the tactile sensor distribution, which was based on human two-point discrimination data.
Reproduced from Yasunori Yamada, Hoshinori Kanazawa, Sho Iwasaki, Yuki Tsukahara, Osuke Iwata, Shigehito Yamada, and Yasuo Kuniyoshi, An Embodied Brain Model of the Human Foetus, Scientific Reports, 6 (27893), Figure 1d, doi:10.1038/srep27893 © 2016 Yasunori Yamada, Hoshinori Kanazawa, Sho Iwasaki, Yuki Tsukahara, Osuke Iwata, Shigehito Yamada, and Yasuo Kuniyoshi. This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). It is attributed to the authors Yasunori Yamada, Hoshinori Kanazawa, Sho Iwasaki, Yuki Tsukahara, Osuke Iwata, Shigehito Yamada, and Yasuo Kuniyoshi.
Figure 45.2e. The iCub baby humanoid robot. The iCub (Metta et al. 2010) has the size of a roughly four-year-old child and corresponding sensorimotor capacities: 53 degrees of freedom (electrical motors), two cameras in a biomimetic stereo arrangement, and over 4,000 tactile sensors covering its body. The panel shows the robot performing self-touch and the corresponding activations in the tactile arrays of the left forearm and right index finger.
## Social interaction robots: Erica and Pepper
Figure 45.2f. Erica, the latest creation of Prof. Hiroshi Ishiguro, was designed specifically with the goal of imitating human speech and body language patterns, in order to have 'highly natural' conversations. It also serves as a tool to study human-robot interaction, and social interaction in general. Moreover, because of its close resemblance to humans, the 'uncanny valley' hypothesis (the idea that people get uneasy when robots are too humanlike) can be further explored and analyzed (see, e.g., Rosenthal-von der Pütten, Marieke, and Weiss 2014, where the Geminoid HI-1 modeled after Prof. Ishiguro was used).
Credit: Photo courtesy of Hiroshi Ishiguro Laboratory, ATR and Osaka University.
Figure 45.2g. Pepper, a robot developed by Aldebaran (now Softbank Robotics), although much simpler (and much cheaper!) than Erica, is used successfully to study social interaction, for entertainment, and to perform certain tasks (such as selling Nespresso machines to customers in Japan).
## The Role of Development
A very powerful approach to deepen our understanding of cognition, and one that has been around for a long time in psychology and neuroscience, is to study ontogenetic development. During the past two decades or so, this idea has been adopted by the robotics community and has led to a thriving research field dubbed 'developmental robotics.' Now, a crucial part of ontogenesis takes place in the uterus. There, the tactile sense is the first to develop (Bernhardt 1987) and may thus play a key role in the organism's learning about its first sensorimotor contingencies, in particular, those pertaining to its own body (e.g., hand-to-mouth behaviors). Motivated by this fact, Mori and Kuniyoshi (2010) developed a musculoskeletal fetal simulator with over 1,500 tactile receptors, and studied the effect of their distribution on the emergence of sensorimotor behaviors. Importantly, with a natural (non-homogeneous) distribution, the fetus developed 'normal' kicking and jerking movements (i.e., similar to those observed in a human fetus), whereas with a homogeneous allocation it did not develop any of these behaviors. Yamada et al. (2016), using a similar fetal simulator and a large spiking neural network brain model, have further studied the effects of intrauterine (vs. extrauterine) sensorimotor experiences on cortical learning of body representations. A physical version, the fetusoid, is currently under development (Mori et al. 2015). Somatosensory (tactile and proprioceptive) inputs continue to be of key importance in early infancy, when 'infants engage in exploration of their own body as it moves and acts in the environment. They babble and touch their own body, attracted and actively involved in investigating the rich intermodal redundancies, temporal contingencies, and spatial congruence of self-perception' (Rochat 1998, p. 102). The iCub baby humanoid robot (Metta et al. 2010) (Figure 45.2e in Box 45.2), equipped with a whole-body tactile array (Maiolino et al. 2013) comprising over 4,000 elements, is an ideal platform to study these processes. The study of Roncone et al. (2014) on self-calibration using self-touch is a first step in this direction.
## Applications of Human-like Robots
Finally, this research strand, employing humanoid robots to study human cognition, also has important applications. In traditional domains and conventional tasks, such as pick-and-place operations in an industrial environment, current factory automation robots are doing just fine. However, robots are starting to leave these constrained domains, entering environments that are far less structured, and are starting to share their living space with humans. As a consequence, they need to dynamically adapt to unpredictable interactions and guarantee their own as well as others' safety at every moment. In such cases, more human-like characteristics, both physical and 'mental', are desirable. Box 45.3 illustrates how more brain-like body representations can help robots to become more autonomous, robust, and safe. The possibilities for future applications of robots with cognitive capacities are enormous, especially in the rapidly
## BOX 45.3 Body schema in humans vs. robots
Figure 45.3. Characteristics of body representations.
<details>
<summary>Image 11 Details</summary>

### Visual Description
## Conceptual Diagram: Biological Body Representations vs. Robot Performance
### Overview
The image is a conceptual diagram illustrating the relationship between modeling mechanisms of biological body representations and achieving better performance in robots. It uses a 3D space defined by three axes: fixed/plastic, centralized/distributed, and implicit/explicit (multimodal/amodal or unimodal). The diagram suggests a progression from biological systems (monkey brain) to robotic systems, with an arrow indicating the direction of influence.
### Components/Axes
* **Axes:**
* Vertical Axis: "fixed" (top) to "plastic" (bottom)
* Vertical Axis (parallel): "centralized" (top) to "distributed" (bottom)
* Horizontal Axis: "implicit" (left) to "explicit" (right)
* Horizontal Axis (parallel): "multimodal" (left) to "amodal or unimodal" (right)
* **Elements:**
* Top-Left: Image of a monkey and the text "1. Modeling mechanisms of biological body representations" enclosed in a rounded rectangle.
* Bottom-Left: Image of a brain with highlighted regions.
* Center: Image of a humanoid robot.
* Top-Right: Diagram of a robotic arm with labeled components (a1, a2, a3, θ1, θ2, θ3, px, py, pz) and corresponding equations.
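* $p_x = \cos\theta_1 \,\bigl(a_3 \cos(\theta_2 + \theta_3) + a_2 \cos\theta_2\bigr)$
* $p_y = \sin\theta_1 \,\bigl(a_3 \cos(\theta_2 + \theta_3) + a_2 \cos\theta_2\bigr)$
* $p_z = a_3 \sin(\theta_2 + \theta_3) + a_2 \sin\theta_2 + a_1$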
* Bottom-Center: Text "2. Better performance of robots - autonomy, robustness, safety" enclosed in a rounded rectangle.
* Red Arrow: Points from the monkey/robot area towards the brain area.
### Detailed Analysis
* **Fixed/Plastic Axis:** This axis likely refers to the rigidity or adaptability of the system. "Fixed" implies a pre-defined structure, while "plastic" suggests a system capable of learning and adapting.
* **Centralized/Distributed Axis:** This axis describes the location of control or processing. "Centralized" indicates a single control center, while "distributed" implies multiple, interconnected processing units.
* **Implicit/Explicit Axis:** This axis refers to the level of awareness or accessibility of the information. "Implicit" suggests subconscious or embedded knowledge, while "explicit" indicates readily available and understandable information.
* **Robotic Arm Area:** The robotic arm diagram and its equations correspond to the "fixed," "centralized," "explicit," and "amodal or unimodal" end of the spectrum.
* **Monkey/Brain Area:** The monkey and the brain image correspond to the "plastic," "distributed," "implicit," and "multimodal" end of the spectrum.
* **Arrow:** The red arrow points toward the brain area, that is, toward the "plastic," "distributed," "implicit," and "multimodal" side.
### Key Observations
* The diagram contrasts biological systems (monkey brain) with robotic systems.
* The arrow suggests that insights from biological systems can inform the development of better robots.
* The axes highlight key differences between biological and robotic systems in terms of adaptability, control, and information representation.
### Interpretation
The diagram illustrates the idea that understanding how biological systems represent and control movement (e.g., the monkey's brain) can lead to improvements in robot design and performance. The progression from "fixed" and "centralized" robotic systems towards more "plastic" and "distributed" systems, inspired by biological models, could result in robots with greater autonomy, robustness, and safety. The diagram suggests that robots can benefit from incorporating principles of plasticity, distributed control, and multimodal sensory integration, which are characteristic of biological systems. The equations provided for the robotic arm likely represent a simplified kinematic model, which could be enhanced by incorporating more biologically-inspired control strategies.
</details>
Credit: Monkey photo source: Einar Fredriksen/Flickr/Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Credit: Brain image source: Hugh Guiney/Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
Credit: Line drawing and equations source: Reproduced with the permission of Dr. Hugh Jack from http://www.engineeronadisk.com
Credit: iCub Robot source: © iCub Facility-IIT, 2017
A typical example of a traditional robot and its mathematical model is depicted in the upper right of Figure 45.3. The robot is an arm consisting of three segments with three joints between the base and the final part, the end-effector. Below the robot is its model: the forward kinematics equations that relate the configuration of the robot (joint positions $\theta_1, \theta_2, \theta_3$) to the Cartesian position of the end-effector $(p_x, p_y, p_z)$. The model has the following characteristics: (1) it is explicit: there is a one-to-one correspondence between the body and the model ($a_1$ in the model is the length of the first arm segment, for example); (2) it is unimodal: the equations directly describe physical reality, and one sensory modality (proprioception, i.e., joint angle values) is sufficient to obtain the correct mapping in the current robot state; (3) it is centralized: there is only one model that describes the whole robot; (4) it is fixed: normally, this mapping is set and does not change during robot operation. Other models or mappings are typically needed for robot operation, such as inverse kinematics, differential kinematics, or models of dynamics (dealing with forces and torques), but they would all share the abovementioned characteristics (see Hoffmann et al. 2010 for a survey).
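To make the four characteristics concrete, the following is a minimal sketch of what such a forward kinematics model looks like in code. It does not reproduce the equations in Figure 45.3: the geometry (joint 1 rotating about the vertical axis, joints 2 and 3 bending in a vertical plane), the function name `forward_kinematics`, and the default segment lengths are all assumptions made for illustration.

```python
import math


def forward_kinematics(theta1, theta2, theta3, a1=0.30, a2=0.25, a3=0.20):
    """Map joint angles (radians) to the end-effector position (p_x, p_y, p_z).

    ASSUMED geometry (hypothetical, not taken from Figure 45.3):
    joint 1 rotates the arm about the vertical axis, joints 2 and 3 bend
    in the vertical plane; a1 is the height of the first segment and
    a2, a3 are the lengths of the second and third segments (meters).
    """
    # Horizontal reach of the arm within its vertical plane.
    reach = a2 * math.cos(theta2) + a3 * math.cos(theta2 + theta3)
    p_x = reach * math.cos(theta1)
    p_y = reach * math.sin(theta1)
    p_z = a1 + a2 * math.sin(theta2) + a3 * math.sin(theta2 + theta3)
    return p_x, p_y, p_z


# Usage: arm stretched out horizontally along the x axis.
print(forward_kinematics(0.0, 0.0, 0.0))
```

Even in this toy form, the model is explicit (each parameter names a body segment), unimodal (only joint angles enter), centralized (one function covers the whole arm), and fixed (the segment lengths are constants supplied in advance).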
As pointed out earlier, animals and humans have different bodies than robots; they also have very different ways of representing them in their brains. The panel in the lower left shows the rhesus macaque and, below it, some of the key areas of its brain that deal with body representations (see, e.g., Graziano and Botvinick 2002). There is ample evidence that these representations differ widely from the ones traditionally used in robotics. Namely, 'the body in the brain' would be (1) implicitly represented: there would hardly be a 'place' or a 'circuit' encoding, say, the length of a forearm; such information is most likely only indirectly available and possibly in relation to other variables; (2) multimodal: drawing mainly on somatosensory (tactile and proprioceptive) and visual, but also vestibular (inertial) information, and closely coupled to motor information; (3) distributed: there are numerous distinct, but partially overlapping and interacting representations that are dynamically recruited depending on context and task; (4) plastic: adapting over both long (ontogenesis) and short time scales, as adaptation to tool use (e.g., Iriki et al. 1996) or various body illusions testify (e.g., humans start feeling ownership over a rubber hand after minutes of synchronous tactile stimulation of the hand replica and their real hand hidden under a table; Botvinick and Cohen 1998).
The iCub robot 'walking' from the top right to the bottom left in the figure illustrates two things. First, in order to model the mechanisms of biological body representations, the traditional robotic models are of little use; a radically different approach needs to be taken. Second, by making the robot models more brain-like, we hope to inherit some of the desirable properties typical of how humans and animals master their highly complex bodies. Autonomy and robustness (or resilience) are one such case. It is not realistic to expect that conditions, including the body, will stay constant over time and that a model given to the robot by the manufacturer will always work. Inaccuracies will creep in due to wear and tear, and more dramatic changes can also occur (e.g., a joint becomes blocked). Humans and animals display a remarkable capacity for dealing with such changes: their models dynamically adapt to muscle fatigue, for example, or temporarily incorporate objects like tools after working with them, or reallocate 'brain territory' to different body parts in case of amputation of a limb. Robots thus also need to perform continuous self-modeling (Bongard et al. 2006) in order to cope with such changes. Finally, unlike factory robots that blindly execute their trajectories and thus need to operate in cages, humans and animals use multimodal information to extend the representation of their bodies to the space immediately surrounding them (also called peripersonal space). They construct a 'margin of safety,' a virtual 'bubble' around their bodies that allows them to respond to potential threats such as looming objects, warranting safety for them and also for their surroundings (e.g., Graziano and Cooke 2006). This is highly desirable in robots as well, and can transform them from dangerous machines into collaborators possessing whole-body awareness like we do. First steps along these lines in the iCub were presented by Roncone et al. (2016).
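The idea of a plastic body model that is re-estimated from the robot's own sensorimotor experience can be illustrated with a deliberately simple sketch. This is not the method of Bongard et al. (2006), nor the one used on the iCub; it is an ordinary least-squares fit under assumed conditions: a planar two-link arm, noise-free 'visual' observations of the hand, and hypothetical function names (`hand_position`, `collect_samples`, `estimate_link_lengths`) and link lengths.

```python
import math
import random


def hand_position(theta1, theta2, a1, a2):
    """Hand position of a planar two-link arm (assumed toy geometry)."""
    x = a1 * math.cos(theta1) + a2 * math.cos(theta1 + theta2)
    y = a1 * math.sin(theta1) + a2 * math.sin(theta1 + theta2)
    return x, y


def collect_samples(a1, a2, rng, n=20):
    """Random 'motor babbling': joint angles paired with the observed hand position."""
    samples = []
    for _ in range(n):
        t1 = rng.uniform(-math.pi, math.pi)
        t2 = rng.uniform(-math.pi, math.pi)
        samples.append((t1, t2) + hand_position(t1, t2, a1, a2))
    return samples


def estimate_link_lengths(samples):
    """Least-squares estimate of (a1, a2) from (theta1, theta2, x, y) samples.

    The hand position is linear in the link lengths, so the fit reduces
    to solving the 2x2 normal equations directly.
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t1, t2, x, y in samples:
        rows = (
            (math.cos(t1), math.cos(t1 + t2), x),
            (math.sin(t1), math.sin(t1 + t2), y),
        )
        for u, v, target in rows:
            s11 += u * u
            s12 += u * v
            s22 += v * v
            b1 += u * target
            b2 += v * target
    det = s11 * s22 - s12 * s12
    return (s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det


# Usage: the body model before and after the forearm is 'extended' by a tool.
rng = random.Random(0)
before = estimate_link_lengths(collect_samples(0.30, 0.25, rng))
after = estimate_link_lengths(collect_samples(0.30, 0.40, rng))
print(before, after)
```

The point of the sketch is only that the model parameters are no longer supplied once by the manufacturer but are recovered, and can be recovered again, from paired motor and sensory data. The resulting model is still explicit and centralized, so it captures plasticity but not the other properties of biological body representations.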
growing area of service robotics, where robots perform tasks in human environments. Rather than accomplishing them autonomously, they often do it in cooperation with humans, which constitutes a big trend in the field. In cooperative tasks, it is of course crucial that the robots understand the common goals and the intentions of the humans in order to be successful. In other words, they require substantial cognitive skills. We have barely started exploiting the vast potential of these types of cognitive machines.
## Conclusion
Our analysis so far has demonstrated that robots fit squarely into the embodied and pragmatic (action-oriented) turn in cognitive sciences (e.g., Engel et al. 2013), which implies that whole behaving systems rather than passive subjects in brain scanners need to be studied. Robots provide the necessary grounding to computational models of the brain by incorporating the indispensable brain-body-environment coupling (Pezzulo et al. 2011). The advantage of the synthetic methodology, or 'understanding by building' (Pfeifer and Bongard 2007), is that one learns a lot in the process of building the robot and instantiating the behavior of interest. The theory one wants to test thus automatically becomes explicit, detailed, and complete. Robots become virtual experimental laboratories retaining all the virtues of 'theories expressed as simulations' (Cangelosi and Parisi 2002), but they bring the additional advantage that there is no 'reality gap': there is real physics and real sensory stimulation, which lends more credibility to the analysis if embodiment is at center stage.
We are convinced that robots are the right tools to help us understand the embodied, embedded, and extended nature of cognition because their makeup (physical artifacts with sensors and actuators interacting with their environment) automatically warrants the necessary ingredients. They seem particularly suited for investigations of cognition from the bottom up (Pfeifer et al. 2014), where development under particular constraints in brain-body-environment coupling is crucial (e.g., Thelen and Smith 1994). It also becomes possible to simulate conditions that one would not be able to test in humans or animals; think of the simulation of fetal ontogenesis while manipulating the distribution of tactile receptors (Mori and Kuniyoshi 2010). Furthermore, many additional variables (such as internal states of the robot) become easily accessible and lend themselves to quantitative analysis, for example using methods from information theory. Therefore, the combination of a robot with sensorimotor capacities akin to those of humans, the possibility of emulating the robot's growth and development, and the ease of access to all internal variables, which can be subjected to rigorous quantitative investigation, creates a very powerful tool to help us understand cognition.
We want to close with some thoughts on whether it is possible to realize, next to embodied, embedded, and extended robots, enactive robots as well. Most researchers in embodied AI/cognitive robotics automatically adopt the perspective of extended functionalism (Clark 2008; Wheeler 2011), whereby the boundaries of cognitive systems can be extended beyond the agent's brain and even skin to include the body and environment. However, it has been pointed out by the proponents of enactive cognitive science (Di Paolo 2010; Froese and Ziemke 2009) that in order to understand cognition in its entirety, embedding the agent in a closed-loop sensorimotor interaction with the environment is necessary, yet may not be sufficient to induce important properties of biological agents such as intentional agency. In other words, one should not only study instances of individual closed sensorimotor loops as models of biological agents (that would be the recommendation of Webb 2009), but one should also try to endow the models (robots in this case) with properties and constraints similar to those that biological organisms are facing. In particular, it has been argued that life and cognition are tightly interconnected (Maturana and Varela 1980; Thompson 2007), and that a particular organization of living systems, which can be characterized by autopoiesis (Maturana and Varela 1980) or metabolism, for example, is crucial for the agent to truly acquire meaning in its interactions with the world. While these requirements are very hard to satisfy with the artificial systems of today, Di Paolo (2010) proposes a way out: robots need not metabolize, but they should be subject to so-called precarious conditions. That is, the success of a particular instantiation of sensorimotor loops or neural vehicles in the agent is to be measured against some viability criterion that is intrinsic to the organization of the agent (e.g., loss of battery charge, or overheating leading to electronic board problems and resulting in loss of mobility). The control structure may develop over time, but the viability constraint needs to be satisfied, otherwise the agent 'dies' (McFarland and Boesser 1993). In a similar vein, in order to move from embodied to enactive AI, Froese and Ziemke (2009) propose to extend the design principles for autonomous agents of Pfeifer and Scheier (2001), requiring the agents to generate their own systemic identity and regulate their sensorimotor interaction with the environment in relation to a viability constraint. The unfortunate implication, however, is that research along these lines will in the short term most likely not produce useful artifacts. On the other hand, this approach may eventually give rise to truly autonomous robots with unimaginable application potential.
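What it means to measure a control structure against an intrinsic viability criterion can be shown with a toy sketch. It is not a formalization of the proposals of Di Paolo (2010) or Froese and Ziemke (2009); the `Agent` class, the two policies, and all thresholds and rates are hypothetical values chosen only to make the example run.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    battery: float = 1.0       # fraction of full charge (hypothetical units)
    temperature: float = 30.0  # board temperature in degrees Celsius (hypothetical)
    alive: bool = True

    def viable(self):
        """Viability criterion intrinsic to the agent: charge left, no overheating."""
        return self.battery > 0.0 and self.temperature < 80.0

    def step(self, action):
        """Apply one action; the agent 'dies' once the criterion is violated."""
        if not self.alive:
            return
        if action == "move":
            self.battery -= 0.1
            self.temperature += 10.0
        elif action == "rest":
            self.battery = min(1.0, self.battery + 0.2)
            self.temperature = max(30.0, self.temperature - 15.0)
        self.alive = self.viable()


def always_move(agent):
    """Policy that ignores the agent's internal state."""
    return "move"


def regulating(agent):
    """Policy that regulates behavior in relation to the viability constraint."""
    if agent.battery < 0.3 or agent.temperature > 60.0:
        return "rest"
    return "move"


def lifetime(policy, steps=50):
    """Number of steps the agent survives under the given policy."""
    agent = Agent()
    for t in range(steps):
        agent.step(policy(agent))
        if not agent.alive:
            return t + 1
    return steps


print(lifetime(always_move), lifetime(regulating))
```

In this sketch the policies are hand-written, whereas in the enactive proposals the control structure would develop over time; the viability check merely determines which structures persist.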
## Acknowledgments
M.H. was supported by a Marie Curie Intra European Fellowship (iCub Body Schema 625727) within the 7th European Community Framework Programme and the Czech Science Foundation under Project GA17-15697Y.
## References
- Beer, R.D. and Williams, P.L. (2015). Information processing and dynamics in minimally cognitive agents. Cognitive Science, 39, 1-38.
- Bernhardt, J. (1987). Sensory capabilities of the fetus. MCN: The American Journal of Maternal/Child Nursing, 12(1), 44-7.
- Blickhan, R., Seyfarth, A., Geyer, H., Grimmer, S., Wagner, H., Günther, M. et al. (2007). Intelligence by mechanics. Philosophical transactions. Series A, 365, 199-220.
- Bongard, J., Zykov, V., and Lipson, H. (2006). Resilient machines through continuous self-modeling. Science, 314, 1118-21.
- Botvinick, M. and Cohen, J. (1998). Rubber hands 'feel' touch that eyes see. Nature, 391(6669), 756.
- Braitenberg, V. (1986). Vehicles: experiments in synthetic psychology. Cambridge, MA: MIT Press.
- Brooks, R. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1), 14-23.
- Brooks, R.A. (1989). A robot that walks: emergent behaviors from a carefully evolved network. Neural Computation, 1, 153-62.
- Brooks, R.A. (1991a). Intelligence without reason. In: J. Mylopoulos (ed.), Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (vol. 1). San Francisco, USA: Morgan Kaufmann, pp. 569-95.
- Brooks, R.A. (1991b). Intelligence without representation. Artificial Intelligence, 47, 139-59.
- Bührmann, T., Di Paolo, E., and Barandiaran, X. (2013). A dynamical systems account of sensorimotor contingencies. Frontiers in Psychology, 4, 285.
- Cangelosi, A. and Parisi, D. (2002). Computer simulation: a new scientific approach to the study of language evolution. In: Simulating the evolution of language. London: Springer Science & Business Media, pp. 3-28.
- Clark, A. (2008). Supersizing the mind: embodiment, action, and cognitive extension. New York: Oxford University Press.
- Clark, A. and Grush, R. (1999). Towards cognitive robotics. Adaptive Behavior, 7(1), 5-16.
- Collins, S., Ruina, A., Tedrake, R., and Wisse, M. (2005). Efficient bipedal robots based on passive dynamic walkers. Science, 307, 1082-5.
- Dennett, D. (1995). Darwin's dangerous idea. New York: Simon & Schuster.
- Di Paolo, E. (2010). Robotics inspired in the organism. Intellectica, 53-54, 129-62.
- Engel, A.K., Maye, A., Kurthen, M., and König, P. (2013). Where's the action? The pragmatic turn in cognitive science. Trends in Cognitive Sciences, 17(5), 202-9.
- Floreano, D., Pericet-Camara, R., Viollet, S., Ruffier, F., Brückner, A., Leitel, R. et al. (2013). Miniature curved artificial compound eyes. Proceedings of the National Academy of Sciences, 110(23), 9267-72.
- Fodor, J. (1975). The language of thought. Cambridge, MA: Harvard University Press.
- Franceschini, N., Pichon, J., and Blanes, C. (1992). From insect vision to robot vision. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 337, 283-94.
- Froese, T. and Ziemke, T. (2009). Enactive artificial intelligence: investigating the systemic organization of life and mind. Artificial Intelligence, 173(3), 466-500.
- Garofalo, M., Nieus, T., Massobrio, P., and Martinoia, S. (2009). Evaluation of the performance of information theory-based methods and cross-correlation to estimate the functional connectivity in cortical networks. PLoS ONE, 4(8), e6482.
- Graziano, M. and Botvinick, M. (2002). How the brain represents the body: insights from neurophysiology and psychology. In: W. Prinz and B. Hommel (eds.), Common mechanisms in perception and action: attention and performance. New York: Oxford University Press, pp. 136-57.
- Graziano, M. and Cooke, D. (2006). Parieto-frontal interactions, personal space, and defensive behavior. Neuropsychologia, 44(6), 845-59.
- Grush, R. (2004). The emulation theory of representation: motor control, imagery, and perception. Behavioral and Brain Sciences, 27, 377-442.
- Haugeland, J. (1985). Artificial intelligence: the very idea. Cambridge, MA: MIT Press.
- Hoffmann, M., Marques, H., Arieta, A., Sumioka, H., Lungarella, M., and Pfeifer, R. (2010). Body schema in robotics: a review. IEEE Transactions on Autonomous Mental Development, 2(4), 304-24.
- Hoffmann, M. and Pfeifer, R. (2011). The implications of embodiment for behavior and cognition: animal and robotic case studies. In: W. Tschacher and C. Bergomi (eds.), The implications of embodiment: cognition and communication. Exeter: Imprint Academic, pp. 31-58.
- Hoffmann, M., Schmidt, N.M., Pfeifer, R., Engel, A.K., and Maye, A. (2012). Using sensorimotor contingencies for terrain discrimination and adaptive walking behavior in the quadruped robot Puppy. In: T. Ziemke, C. Balkenius, and J. Hallam (eds.), From animals to animats 12. SAB 2012. Lecture Notes in Computer Science (vol. 7426). Berlin, Heidelberg: Springer, pp. 54-64.
- Hoffmann, M., Stepanova, K., and Reinstein, M. (2014). The effect of motor action and different sensory modalities on terrain classification in a quadruped robot running with multiple gaits. Robotics and Autonomous Systems, 62(12), 1790-8.
- Iriki, A., Tanaka, M., and Iwamura, Y. (1996). Coding of modified body schema during tool use by macaque postcentral neurones. Neuroreport, 7, 2325-30.
- Jeannerod, M. (2001). Neural simulation of action: a unifying mechanism for motor cognition. NeuroImage, 14, 103-9.
- Koditschek, D.E., Full, R.J., and Buehler, M. (2004). Mechanical aspects of legged locomotion control. Arthropod Structure and Development, 33, 251-72.
- Lichtensteiger, L. (2004). On the interdependence of morphology and control for intelligent behavior [PhD dissertation]. Zurich: University of Zurich.
- Lungarella, M. and Sporns, O. (2006). Mapping information flow in sensorimotor networks. PLoS Computational Biology, 2, 1301-12.
- Maiolino, P., Maggiali, M., Cannata, G., Metta, G., and Natale, L. (2013). A flexible and robust large scale capacitive tactile system for robots. Sensors Journal, IEEE, 13(10), 3910-7.
- Maturana, H. and Varela, F. (1980). Autopoiesis and cognition: the realization of the living. Dordrecht: D. Reidel Publishing.
- Maye, A. and Engel, A.K. (2012). Time scales of sensorimotor contingencies. In: H. Zhang, A. Hussain, D. Liu, and Z. Wang (eds.), Advances in brain inspired cognitive systems. BICS 2012. Lecture Notes in Computer Science (vol. 7366). Berlin, Heidelberg: Springer, pp. 240-9.
- McGeer, T. (1990). Passive dynamic walking. The International Journal of Robotics Research, 9(2), 62-82.
- Metta, G., Natale, L., Nori, F., Sandini, G., Vernon, D., Fadiga, L. et al. (2010). The iCub humanoid robot: an open-systems platform for research in cognitive development. Neural Networks, 23(8-9), 1125-34.
- Mori, H., Akutsu, D., and Asada, M. (2015). Fetusoid35: a robot research platform for neural development of both fetuses and preterm infants and for developmental care. In: A. Duff, N.F. Lepora, A. Mura, T.J. Prescott, and P.F.M.J. Verschure (eds.), Biomimetic and biohybrid systems. Living Machines 2014. Lecture Notes in Computer Science (vol. 8608). New York: Springer International Publishing, pp. 411-13.
- Mori, H. and Kuniyoshi, Y. (2010). A human fetus development simulation: self-organization of behaviors through tactile sensation. In: 2010 IEEE 9th International Conference on Development and Learning. doi:10.1109/DEVLRN.2010.5578860
- Nakanishi, Y., Asano, Y., Kozuki, T., Mizoguchi, H., Motegi, Y., Osada, M. et al. (2012). Design concept of detail musculoskeletal humanoid 'Kenshiro': toward a real human body musculoskeletal simulator. In: 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids). doi:10.1109/HUMANOIDS.2012.6651491
- Olsson, L., Nehaniv, C.L., and Polani, D. (2004). Sensory channel grouping and structure from uninterpreted sensory data. In: Proceedings. 2004 NASA/DoD Conference on Evolvable Hardware, 2004. doi:10.1109/EH.2004.1310825
- O'Regan, J.K. and Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939-1031.
- Pezzulo, G., Barsalou, L.W., Cangelosi, A., Fischer, M.H., McRae, K., and Spivey, M.J. (2011). The mechanics of embodiment: a dialog on embodiment and computational modeling. Frontiers in Psychology, 2, 5.
- Pfeifer, R. and Bongard, J.C. (2007). How the body shapes the way we think: a new view of intelligence. Cambridge, MA: MIT Press.
- Pfeifer, R., Iida, F., and Lungarella, M. (2014). Cognition from the bottom up: on biological inspiration, body morphology, and soft materials. Trends in Cognitive Sciences, 18(8), 404-13.
- Pfeifer, R. and Scheier, C. (2001). Understanding intelligence. Cambridge, MA: MIT Press.
- Pylyshyn, Z. (1984). Computation and cognition: toward a foundation for cognitive science. Cambridge, MA: MIT Press.
- Quiroga, R.Q. and Panzeri, S. (2009). Extracting information from neuronal populations: information theory and decoding approaches. Nature Reviews Neuroscience, 10(3), 173-85.
- Richter, C., Jentzsch, S., Hostettler, R., Garrido, J.A., Ros, E., Knoll, A. et al. (2016). Musculoskeletal robots: scalability in neural control. IEEE Robotics & Automation Magazine, 23(4), 128-37. doi:10.1109/MRA.2016.2535081
- Rochat, P. (1998). Self-perception and action in infancy. Experimental Brain Research, 123, 102-9.
- Roncone, A., Hoffmann, M., Pattacini, U., Fadiga, L., and Metta, G. (2016). Peripersonal space and margin of safety around the body: learning tactile-visual associations in a humanoid robot with artificial skin. PLoS ONE, 11(10), e0163713.
- Roncone, A., Hoffmann, M., Pattacini, U., and Metta, G. (2014). Automatic kinematic chain calibration using artificial skin: self-touch in the iCub humanoid robot. In: 2014 IEEE International Conference on Robotics and Automation (ICRA). doi:10.1109/ICRA.2014.6907178
- Rosenthal-von der Pütten, A.M., Marieke, A., and Weiss, A. (2014). The uncanny in the wild: analysis of unscripted human-android interaction in the field. International Journal of Social Robotics, 6(1), 67-83.
- Saranli, U., Buehler, M., and Koditschek, D. (2001). RHex: a simple and highly mobile hexapod robot. The International Journal of Robotics Research, 20, 616-31.
- Schatz, T. and Oudeyer, P.Y. (2009). Learning motor dependent Crutchfield's information distance to anticipate changes in the topology of sensory body maps. In: 2009 IEEE 8th International Conference on Development and Learning. doi:10.1109/DEVLRN.2009.5175526
- Schmidt, N., Hoffmann, M., Nakajima, K., and Pfeifer, R. (2013). Bootstrapping perception using information theory: case studies in a quadruped robot running on different grounds. Advances in Complex Systems, 16(2-3), 1250078.
- Song, Y.M. et al. (2013). Digital cameras with designs inspired by the arthropod eye. Nature, 497(7447), 95-9.
- Thelen, E. and Smith, L. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
- Thompson, E. (2007). Mind in life: biology, phenomenology, and the sciences of mind. Cambridge, MA: MIT Press.
- Walter, G.W. (1953). The living brain. New York: Norton & Co.
- Webb, B. (2004). Neural mechanisms for prediction: do insects have forward models? Trends in Neurosciences, 27(5), 278-82.
- Webb, B. (2009). Animals versus animats: or why not model the real iguana? Adaptive Behavior, 17, 269-86.
- Wheeler, M. (2011). Embodied cognition and the extended mind. In: J. Garvey (ed.), The Continuum companion to philosophy of mind. London: Continuum, pp. 220-36.
- Yamada, Y., Kanazawa, H., Iwasaki, S., Tsukahara, Y., Iwata, O., Yamada, S. et al. (2016). An embodied brain model of the human foetus. Scientific Reports, 6. doi:10.1038/srep27893