November 2019
Text by Ray Gallon

Image: © PhonlamaiPhoto/

Ray Gallon is president and co-founder of The Transformation Society. He has over 20 years’ experience in the technical content industries, including major companies such as IBM, Alcatel, and General Electric Health Care. He has contributed to numerous books, journals, and magazines, and is the editor of The Language of Technical Communication (XML Press).



Learning to learn with machines

The rapid advance of technology provides our society with many social and ethical challenges. How can we guide users through a world fostering a society of machines and humans?

Greetings, readers, here’s a riddle for you to solve: Figure 1 presents part of a Spanish family tree, showing how Spanish naming conventions work. If you don’t already know this, take a look at it, and see if you can figure out the family names for Marta and Carlos at the bottom of the chart.


Figure 1: Diagram showing Spanish naming conventions. What will the family names be for Marta and Carlos?
Source: Ray Gallon, The Transformation Society


In all probability, you had no problem deducing the answer[1], even if you knew nothing about the subject beforehand. The chart provides enough examples that most people can generalize from them to a new case.
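
For readers who think in code, the rule that the chart teaches by example can also be written out explicitly. The short Python sketch below assumes the traditional convention, in which a child takes the father's first family name followed by the mother's first family name. The function name and the sample names are hypothetical and are not taken from Figure 1.

```python
def child_family_names(father_names, mother_names):
    """Return a child's two family names under the traditional convention.

    Each argument is a (first_family_name, second_family_name) pair.
    Assumption: the father's first family name comes first, followed by
    the mother's first family name.
    """
    return (father_names[0], mother_names[0])


# Hypothetical parents, invented for illustration only.
father = ("Pérez", "Ruiz")
mother = ("Moreno", "Díaz")

child = child_family_names(father, mother)
print(" ".join(child))
```

The point of the sketch is the contrast: a program needs the rule spelled out, whereas a human reader infers it from a handful of examples.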

This capacity to generalize is key to many learning situations, and it is pivotal to designing user assistance. Roger C. Schank, a researcher in Artificial Intelligence (AI) and machine learning and a specialist in learning by doing, refers to "scenes," which are comparable to a set of steps in a procedure, or to the whole procedure itself. In any given context, we acquire multiple scenes that contribute to a set of interrelated activities that Schank calls "Memory Organization Packets" (MOPs). In UX terms, an MOP might be the set of procedures needed to complete a task. Schank says that we generalize a scene from one MOP to another. For example, when we learn a scene, such as recognizing a friend and greeting them at a party, we can easily transfer appropriate behavior to meeting a colleague in a professional context (Schank, 1995).

Generalization, in one form or other, is central to a number of theories of learning and cognition, and plays an important role in the training of machine learning algorithms.

Gestalt – the whole is "other" than the sum of its parts

Gestalt psychology, which was very fashionable in the seventies, is based on a holistic view of how the brain works:


  • It tries to understand how we acquire and maintain stable percepts in a noisy world.
  • It assumes that the brain is holistic, parallel, and analog, with self-organizing tendencies.
  • The human eye sees objects in their entirety before perceiving their individual parts.



Figure 2: Reification: the sphere is perceived, not drawn
Source: Slehar at English Wikipedia – Commons Public Domain


This last point would seem to be contradicted by the most recent research in brain function and perception. The modular doctrine of vision (Zeki & Bartels, 1998a & b; Aleksander & Dunmall, 2000) proposes that the visual brain consists of many distributed perceptual systems, each one responsible for processing different visual attributes. Research shows that color is perceived slightly earlier than form, with the two processed almost simultaneously, and that movement is perceived about 50 milliseconds after form (Viviani & Aymoz, 2001). Nonetheless, our brain interprets and merges all of this modular, asynchronous perception for operational reasons, which perhaps reinforces the holistic orientation of gestalt.

In this holistic, analog, parallel processing system, gestalt psychology suggests that we make generalizations not only by analogy, but also by similarity and grouping. Thus, we see a "man in the moon" because features on the lunar surface are grouped in such a way as to suggest a face. We can also, using a process called reification, interpolate shapes that are not physically present, as in Figure 2, where we can see a sphere that is not actually drawn. We infer it and generalize it from our previous knowledge of the sphere form, our acquired visual grammar, and our holistic vision. In other words, we fill in the blank spaces to complete the shape. John Carroll calls this inferential learning, and in his seminal writing on minimalism in technical communication, he states that users anchor their learning better when they need to do this kind of work, rather than having everything explicitly laid out for them (Carroll, 1990).

Constructivism – from inference to social knowledge building

Generating a new idea means systematic combination and recombination of various meanings, which come from the social and cultural environment.

Nikita Basov, St. Petersburg State University – Bielefeld University


It seems impossible to speak about "reality" without mentioning "perception." In fact, what we view as real is, most often, our perception of what is real. We like to think of science as representing facts, but "in reality" (pun intended) science is a constructed model that we use until it no longer works for us. Then we discard it for a new model that works better. The classic example is when we exchanged the Ptolemaic geocentric model of the cosmos for the Copernican heliocentric one. Our models are built from a mixture that includes empirical observation, abstractions from mathematics and logic, and our cultural filters and biases. A theory or model becomes "knowledge" when there is consensus about it. It took a long time for Copernicus’ theory to achieve that consensus, and Galileo was condemned to house arrest for defending it. Yet today, it is considered a "fact" that the planets revolve around the sun.

Thus, knowledge is constructed socially, and depends on first-hand experiences (as in learning by doing). The learning theory known as constructivism is built on this principle:


  • Self-directed learners act on the environment to acquire and test new knowledge.
  • Instructors function as facilitators, not knowledge sources.
  • The learning context is central to learning itself. Learning is an active, social process.
  • Learners collaborate to arrive at shared understanding.

John Carroll's The Nurnberg Funnel (1990) tells us much the same thing in different words. If we want users to be able to find the information they need quickly and easily, they have to be self-directed. They learn by doing, acting on the environment – i.e., the product. The "instructor" (in this case, user assistance) guides the users’ learning; it does not spoon-feed them. We are often enjoined to understand the users’ world – because that is the "learning context" in which they acquire expertise in our products. That world includes interruptions, distractions, emotional swings, and interactions with colleagues who have different levels of expertise in the same product.

How many times, when learning new software, have you asked the local guru in your department for information about how to do something, or been asked by someone else?

How many times have you asked Siri, or Google, or a chatbot?

Can you identify any one source as more reliable than the others?

Hybrid learning

The rapid growth of deployed Industry 4.0 technology is creating an extra-sensorial field of interaction that amplifies human capabilities, not only in time and space, but also in memory (human and digital), cognitive processes, and social problem-solving. Machines are becoming protagonists and building their own communicative layers. Humans already interact with machines as if they were human on some level (Siri, Alexa, Cortana, chatbots, etc.). How far will we go in building a hybrid society of machines and humans? How will this affect learning processes – for both?

No one person can hold a whole culture, or the compendium of knowledge in a field, in their head. As developmental psychologist Lev Vygotsky said in the 1930s, knowledge is developed and spread throughout communities, and is acquired by interacting in society (Vygotsky, 1978). In the near future, we are not only going to be sharing learning experiences and knowledge in communities of humans, but also in the cloud with machines. This hybrid community of shared knowledge will include AI agents and interactions that will become as important as humans in many ways (Lorenzo Galés & Gallon, 2018).

So, if humans developed culture from social interaction, can a machine equivalent of culture arise from the Internet of Things? In IoT, machines are assigned unique identities and then connect with other machines in large networks, creating a decisional ecosystem based on algorithmic languages and machine codes. Are they capable of generalizing from one set of big data to another or from one machine equivalent of an MOP to another?

There are researchers who think that generalization is almost the only way to develop intelligent machines. It is important to note here that this is not a pipe dream about "conscious" computers equipped with artificial general intelligence. It applies to the narrow, domain-specific kinds of AI we know already.

Jeff Hawkins, inventor of the Palm Pilot, has always been interested in how the human brain works, and has started a research foundation dedicated to studying it. In his book, On Intelligence (2005), co-authored with Sandra Blakeslee, he postulates that the human neocortex functions by pattern recognition. We remember the characteristics of objects, situations, experiences, etc. and automatically create predictions of what will come next based on them. Some parts of the neocortex, for example, receive low-level input from the senses. By combining this input (or the memory of it) – similar to Schank’s scenes – with other inputs and memories, the neocortex creates more complex groupings (MOPs?), developing new layers of abstraction. Hawkins believes that this can serve as a unified model for both human and machine intelligence, even for artificial general intelligence.
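
The idea of remembering patterns and predicting what comes next can be illustrated with a deliberately tiny Python sketch. It is not Hawkins' model of the neocortex; it is a simple frequency table, and the class name, method names, and the sample user actions are all hypothetical. It only shows the basic loop of storing observed sequences and using them to anticipate the next element.

```python
from collections import Counter, defaultdict


class SequenceMemory:
    """Toy memory: counts which item has followed which in observed sequences."""

    def __init__(self):
        # Maps an item to a counter of the items seen immediately after it.
        self.following = defaultdict(Counter)

    def observe(self, sequence):
        """Store every adjacent pair of items from one observed sequence."""
        for current, nxt in zip(sequence, sequence[1:]):
            self.following[current][nxt] += 1

    def predict(self, current):
        """Return the most frequently seen successor, or None if unknown."""
        seen = self.following.get(current)
        if not seen:
            return None
        return seen.most_common(1)[0][0]


# Hypothetical sequences of user actions, invented for illustration.
memory = SequenceMemory()
memory.observe(["open", "edit", "save", "close"])
memory.observe(["open", "edit", "save", "print"])
memory.observe(["open", "edit", "undo", "edit", "save", "close"])

prediction = memory.predict("save")  # "close" followed "save" twice, "print" once
print(prediction)
```

A memory this shallow has a single level; the layers of abstraction described above would correspond to stacking such memories so that each one works on the patterns found by the one below it.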

Not only that, the plasticity of the human brain can provide a model for how we want Artificial Intelligence to function. Research is showing that the brain seems capable of learning to process signals from any sensor – for example, the auditory cortex can learn to interpret visual signals. In the same way, "a single deep learning model can jointly learn a number of large-scale tasks from multiple domains," according to Lukasz Kaiser at Google Brain, Aidan S. Gomez from the University of Toronto, and their team, who have successfully demonstrated the passage of machine learning from image to text with sufficient accuracy:

We demonstrate, for the first time, that a single deep learning model can jointly learn a number of large-scale tasks from multiple domains. The key to success comes from designing a multi-modal architecture in which as many parameters as possible are shared and from using computational blocks from different domains together.

Kaiser, et al., 2017


Machine learning can be implemented using a simple, recursive routine, with dynamic access to a large quantity of examples (stored as big data – again analogous to scenes). More concretely, the process is a hierarchy of unsupervised searches, where the output of each one is used as input to the next:

Deep learning is all about hierarchies and abstractions. These hierarchies are controlled by the number of layers in the network along with the number of nodes per layer. Adjusting the number of layers and nodes per layer can be used to provide varying levels of abstraction.

In general, the goal of deep learning is to take low level inputs (feature vectors) and then construct higher and higher-level abstract "concepts" through the composition of layers. The assumption here is that the data follows some sort of underlying pattern generated by many interactions between different nodes on many different layers of the network.

Rosebrock, 2017
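
The composition of layers described in the quotation can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a working deep learning system: the weights are random and untrained, the tanh activation is an arbitrary choice, and the names make_network, forward, and layer_sizes are hypothetical. What it shows is the structure: a list sets the number of layers and the number of nodes per layer, and the output of each layer becomes the input of the next.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Build random (untrained) weights for a fully connected network.

    layer_sizes, e.g. [4, 3, 2], sets the number of layers and the
    number of nodes in each layer.
    """
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # One row of weights per output node, one weight per input node.
        layer = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                 for _ in range(n_out)]
        network.append(layer)
    return network


def forward(network, features):
    """Pass a feature vector through every layer and keep each output."""
    activations = [list(features)]
    for layer in network:
        inputs = activations[-1]  # output of the previous layer
        outputs = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
                   for row in layer]
        activations.append(outputs)
    return activations


network = make_network([4, 3, 2])
activations = forward(network, [0.2, 0.7, 0.1, 0.9])
layer_widths = [len(a) for a in activations]
print(layer_widths)
```

Changing the list passed to make_network, for example to [4, 8, 8, 2], adds layers and nodes without touching the rest of the code, which is the adjustment the quotation refers to.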


The network layers can be located anywhere or spread throughout the Internet.

In short, humans and machines each develop their own flavors of MOP from reusable scenes – much as we do when using structured authoring models such as DITA. Reuse of these scenes implies generalizing their application from one MOP to another. The networking of these MOPs produces experience, culture, and finally, what we refer to as intelligence. This happens at individual levels, inside the brain or inside a computer, and also in networks of individuals: human-human, machine-machine, and hybrid (human-machine).

Connectivism – learning is more important than knowing

Given the emphasis on networks at multiple levels – from individual to community to global scales – it should not be surprising that the most recent evolution of social knowledge building focuses on the communicative ecosystem. It posits that a learner gains more from the act of learning than from the possession of knowledge. Connectivist theory, an evolution of constructivism, states:


  • Knowledge is activated in the world as much as in the head of an individual.
  • It exists through people (and by extension, machines) participating in activities.
  • Learning is the process of creating connections and elaborating a network.
  • Learning is more critical than knowing.
  • Perceiving connections between fields, ideas and concepts is a core skill.
  • Currency (accurate, up-to-date knowledge) is the intent of learning activities – and requires nurturing the networks.

Activating new knowledge in the world demands personal and collective investment. This means that emotions also play an important role. On a primary level, we all know the feeling of emotional satisfaction when we have successfully learned a new task that is useful for our working or personal lives. Emotional connections, which are parallel to the shared knowledge networks and communications networks in any knowledge building process, need to be part of the equation we information experts formulate when we design how users will learn about technological products in a future that includes AI and IoT. And because technology cannot have intrinsic emotions or ethics, it will be our role to create adequate systems of governance, and to make sure that good governance is practiced throughout our processes.


The collective, connectivist vision presented in this article comes from the need to consider human-machine interconnection and communication in a world where connectivity has unknown limits. Humans are going to experience unpredictable cognitive changes just by merging their goals and actions with those of AI agents. This implies social, epistemological, and philosophical challenges that redefine what it means to be alive in a hyperconnected, hybrid society.

As machines gain more cognitive abilities, our own cognitive mechanisms will change too as a result of our interactions with them, and this coexistence will develop in a very organic way, creating a new model of society. Our challenge as information experts will be to keep up at the metacognitive level – to understand what is happening to human perception and cognition, and to be able to guide users who are entering this world and help them navigate it with ease and pleasure, but also with the vigilance that this level of advanced technology requires.



Works cited



[1] Marta Lopez Garcia and Carlos Garcia Sanchez