February 2019
Text by Ursula Reuther and Paul Schmidt

Image: © Devrimb/istockphoto.com

Ursula Reuther studied Applied Linguistics and Translation Studies and works in the field of automated language processing and language technology, both in development and in customer and project support. Her key interests include language standardization, testing tools, controlled language and terminology. She works as a consultant with IAI Linguistic Content AG as well as Congree Language Technology GmbH.


ursula.reuther[at]iailc.de
www.iailc.de


 


Dr. Paul Schmidt has been the CEO of IAI Linguistic Content AG since 2014. After studying philosophy and completing German and English studies, he did his doctorate in German linguistics. He has a strong background in research and language processing and has also worked as a substitute professor for Machine Translation.


paul.schmidt[at]iailc.de
www.iailc.de


 


 

Definition of "Classifier"


A classifier is a program that, based on statistical values, learns to assign classes such as subject domains. From texts that mention the Milky Way, Pluto, stars, planets and moons, the classifier learns that the domain is "Astronomy", even though this word does not appear in the texts. New texts can then be assigned to the correct domain on the basis of this statistical knowledge.
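A minimal sketch of such a classifier in Python, assuming scikit-learn is installed; the tiny training corpus and the labels are invented for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented example data: two short "Astronomy" texts and two "Sports" texts.
train_texts = [
    "Pluto orbits far beyond the outer planets and their moons",
    "The Milky Way contains billions of stars",
    "The striker scored twice in the second half",
    "The goalkeeper saved a penalty in extra time",
]
train_labels = ["Astronomy", "Astronomy", "Sports", "Sports"]

# Bag-of-words counts feed a Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# A new text is assigned to a domain although the word "Astronomy" never occurs in it.
print(model.predict(["New planets and moons were discovered around distant stars"]))
# -> ['Astronomy']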

Artificial Intelligence in technical communication

Artificial Intelligence is penetrating more and more areas of our everyday working life. It is supposed to facilitate many tasks and even replace some completely. But what exactly is Artificial Intelligence and when does it come into play? A standpoint.

The term Artificial Intelligence (AI) is becoming increasingly common in the field of technical communication too, be it in the provision of usage information, as in the new iiRDS standard (intelligent information Request and Delivery Standard), in the intelligent use of information itself, or even in the creation of information. This article primarily explores how Artificial Intelligence could penetrate the field of language processing in technical communication. In the context of this article, we understand AI as a specific approach that has established itself in recent years: Artificial Intelligence based on deep learning.

Explanation of the term

Artificial Intelligence is a branch of computer science. It deals with the modeling, or at least the simulation, of intelligent behavior and enables a computer to solve problems independently. It is assumed that solving such problems requires human abilities. The Turing test was developed to measure these abilities: a human evaluator converses with two conversation partners, one of which is a computer. The evaluator can neither see nor hear the two conversation partners. If, based on the responses, the evaluator cannot tell the computer from the human, the computer is said to be "intelligent".

This article, however, does not focus on the concept of AI itself but instead looks at all applications that are commonly referred to as AI. These include expert systems, knowledge bases, gaming robots, self-driving cars and even medical diagnostic methods, as well as automatic financial analysis programs and speech recognition systems.

The focus of our analysis is on systems that are already implemented in technical communication and are based on automated language processing. A new paradigm has emerged that is based on a specific kind of AI system, namely learning systems. The question now is how these systems can change technical communication.

Characteristics of intelligence

Usually, the ability to learn and adapt to new circumstances is seen as an important characteristic of intelligence. The experiment with the Sphex wasp is often cited as an example of very complex but unintelligent behavior.

When the wasp brings food to its burrow, it first deposits it on the threshold, gets inside the burrow to check for intruders, and then carries in the food. In the experiment, the scientists move the food slightly further away from the entrance of the burrow while the wasp is inside checking the burrow. On emerging, the wasp carries the food back to the threshold and goes into the burrow to look for intruders again. This behavior can be repeated any number of times. The Sphex wasp is obviously incapable of adjusting to the new circumstance. There is no learning effect. This would however be expected of an intelligent system. To qualify as Artificial Intelligence, artifacts such as computers need to be able to deliver, or at least simulate, the effects of intelligence.

Symbolic or connectionist

In the field of AI, there has always been a debate on how intelligence should be modeled. One school of thought assumes that intelligence is rule-based. In the case of linguistic services, for instance, this means that grammars, dictionaries, semantic rules, logical reasoning and conceptual systems explain how the system functions. Such AI is known as rule-based or symbolic AI.

The other school of thought is based on the simulation of human cognition by neural networks. Both approaches make claims about how human cognition is constituted. The question regarding the orientation of AI can therefore also be expressed in terms of cognitive science: Is the human mind a rule-based system, or can it be modeled in a connectionist way?

Like machine translation, AI was one of the first computer applications ever. The field experienced a high point in the 1980s, when expert systems, dialog systems, models of micro-worlds, knowledge bases and ontologies were developed. A distinction was made between connectionist and symbolic, rule-based models; however, the symbolic model was the predominant paradigm.

Speaking of Artificial Intelligence today, people usually refer to the connectionist model: the modeling of intelligent performances by simulating the structure of the human brain using artificial neural networks. Today, AI is intrinsically linked with the machine learning approach, especially deep learning [1]. One of the key elements of intelligence is that an intelligent system is capable of learning. While connectionist approaches are based on this capacity, symbolic approaches are not. Thus, symbolic approaches lack an essential element of intelligence.

AI in language processing

High-quality technical communication needs systems based on automated language processing. Computational linguistics, which provides such systems, has undergone a change of direction in the past few years: it is heading toward a new generation of AI systems based on deep learning. Nevertheless, there remains a need for symbolic knowledge sources such as thesauri, terminologies, metadata and ontologies. Against this background, we want to look at some aspects of these developments that are relevant for the future of technical communication, particularly language testing, translation, tone of voice and terminology processing.

Generally speaking, rule-based AI systems can model certain linguistic abstraction levels in language processing – grammatical structure by phonological, morphological and syntactic rules and textual meaning by semantics. However, symbolic AI reaches its limits when we speak of contextual meaning, where pragmatics and discourse become relevant.

An area in which rule-based systems still deliver relatively good results is automatic text generation for certain text types, because the percentage of stereotypical phrases there is very high; examples include stock market reports and live sports tickers.

Rule-based AI is also widely used in language testing and content optimization. Here, authors receive help in creating linguistically correct and comprehensible information that also corresponds to the corporate language.

However, scientific disciplines such as linguistics, computational linguistics, machine translation and information retrieval have been increasingly leaning towards the new paradigm for some time. They no longer depend on dictionaries, rules, grammars or semantic resources. Machine learning has become feasible because computing capacity and the large volumes of available data (Big Data) make it possible to simulate models of human learning in artificial neural networks and thereby achieve intelligent behavior. Meanwhile, neural networks have learned to translate very well, and they are on the verge of conquering many other areas that play a role in technical communication.

Application for grammar checks

The tools that are used in technical communication include language testing programs. One of their functions is to check grammatical and orthographic correctness. In linguistically based systems, the grammar check relies on morphosyntactic analysis: the program checks, for instance, whether the agreement within a noun phrase is correct, and issues a corresponding message if a rule is violated. Meanwhile, there are a number of grammar checkers on the market that are based on deep learning techniques. Two powerful grammar checkers of this kind for English are "Grammarly" and "Deep Grammar" [2, 3]. Here, the neural network independently determines which patterns and structures are correct. There is an enormous number of such patterns and structures in English grammar. The more the system is fed, the more it learns and the better it gets. The developers of Grammarly themselves recommend: "… don't throw your style guide away."
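To illustrate the rule-based side (checking agreement within a noun phrase), here is a minimal sketch using the open-source spaCy library with its small English model. This is not how Grammarly works internally, merely an example of a morphosyntactic check:

import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def check_det_noun_agreement(text):
    """Flag determiners whose grammatical number disagrees with that of their noun."""
    doc = nlp(text)
    findings = []
    for token in doc:
        if token.dep_ == "det" and token.head.pos_ in ("NOUN", "PROPN"):
            det_number = set(token.morph.get("Number"))
            noun_number = set(token.head.morph.get("Number"))
            # Report only if both sides carry number information and it differs.
            if det_number and noun_number and det_number.isdisjoint(noun_number):
                findings.append((token.text, token.head.text))
    return findings

print(check_det_noun_agreement("These car needs a new battery."))
# Expected (model permitting): [('These', 'car')]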

Evaluating Grammarly is difficult for several reasons. Reported experiments cannot be reproduced, presumably because Grammarly has continued learning in the meantime. For example, one report states that Grammarly does not like repetitions, as shown in the following example [4]:

"Do a mindmap rather than an outline. I know your grade 10 social studies teacher told you that you always needed to prepare an outline."

A more recent version of Grammarly no longer objects to this repetition, but highlights the word "mindmap" instead. Likewise, an earlier experiment in which a scientific text was checked yielded the following results:

 

  • Grammarly does not recognize the term "indexation" and suggests that it be replaced by "induration".
  • Grammarly does not like the word "sophisticated".
  • Every single passive construction is highlighted.
  • In general, deverbal nouns are seen as an error.

As shown in Figure 1, a new version of the program does not reproduce these results, but primarily flags extra spaces.

Figure 1: Grammarly check results.
Source: Grammarly

 

Grammarly reliably finds spelling and grammar errors and recognizes stylistic shortcomings such as missing articles or incorrect spaces. There are many reports of experiments with Grammarly that are difficult to reproduce, which is probably attributable to the learning effect. This creates instability, which is not favorable for technical communication. However, this effect is seen in all systems based on deep learning, including machine translation.

To be able to use Grammarly reliably in technical communication, issues regarding configurability, trainability, adaptation to text types and domains, and terminology integration must be clarified. Grammarly is an excellent tool for general-language texts, especially those written by non-native speakers.

Machine-aided translation

The most impressive application of deep learning in language processing so far is so-called neural machine translation. Here, deep-learning-based MT is superior to symbolic and statistical approaches in every respect. DeepL is an example of this [5]. The platform yields excellent results that in most cases are almost indistinguishable from human translation (Figure 2).

Figure 2: DeepL translation of an extract from Wikipedia "Electric motor".
Source: Wikipedia; DeepL
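DeepL itself is only available as an online service, but the underlying principle can be sketched with open-source components. The following minimal example uses the Hugging Face transformers library with a publicly available MarianMT model for German-to-English; the input sentence is an invented example, not the Wikipedia extract from Figure 2:

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-de-en"                  # pre-trained German-to-English model
tokenizer = MarianTokenizer.from_pretrained(model_name)    # requires the sentencepiece package
model = MarianMTModel.from_pretrained(model_name)

source = ["Ein Elektromotor wandelt elektrische Energie in mechanische Energie um."]
batch = tokenizer(source, return_tensors="pt", padding=True)

# The trained network generates the target sentence token by token.
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
# e.g. ['An electric motor converts electrical energy into mechanical energy.']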

 

However, for technical communication, such systems need to be adapted to the specific characteristics of typical text types and re-trained for the relevant domains and text types. This is the linchpin of the entire application. If this is not achieved, or only partially, a combination of different techniques needs to be employed. This could make up for the possible disadvantages of the deep learning approach, as we will see later.

Generally, however, one can say that the use of machine translation with deep learning mechanisms has the potential to radically change the field of translation in technical communication. In the future, the task of the translator would probably consist of checking whether proper terminology is used consistently. The translator would thus assume the role of editor.

With the right tone

Tone of voice refers to how a message is communicated. A message can be formulated in a casual, formal or scientifically serious way. Depending on its tone, the addressees may or may not feel that they are being taken seriously. The right tonality depends on various factors, such as who is being addressed and what image the company wants to project with its message. The appropriate tonality should therefore be chosen depending on the audience to be addressed and the message to be delivered.

Identifying and assessing the tone of voice would actually be a classic use case for deep learning. However, this would require a large number of annotated corpora of varying origins, with information about which specific formulations go hand in hand with which tone-of-voice category. The system could then learn from this. The learning algorithms are probably also more complex here, because the tone of voice is not always produced by individual phenomena alone, but is determined by the content as a whole.

As long as such corpora are not available in sufficient quantities, symbolic (i.e., rule-based) and statistical methods can be used for this area of application instead. They offer quite satisfactory solutions. So far, symbolic AI methods based on linguistic analysis have been used in this field.

Processing of terminology

Deep learning is already in use in the field of terminology processing. Term extraction already makes use of machine learning: the algorithms are based on statistical values or combined with symbolic methods. TF-IDF is a statistical measure that has been used in information retrieval for decades. TF-IDF stands for term frequency – inverse document frequency and measures the "informativeness" of a term. The assumption is that only very informative expressions qualify as terms.
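The calculation behind TF-IDF can be sketched in a few lines of Python; the miniature corpus below is invented for illustration:

import math
from collections import Counter

documents = [
    "the stator and rotor form the electric motor",
    "the rotor turns inside the stator",
    "the weather today is mild and the sky is clear",
]
tokenized = [doc.split() for doc in documents]

def tf_idf(term, doc_tokens, corpus):
    tf = Counter(doc_tokens)[term] / len(doc_tokens)    # term frequency within one document
    df = sum(1 for d in corpus if term in d)            # number of documents containing the term
    idf = math.log(len(corpus) / df) if df else 0.0     # inverse document frequency
    return tf * idf

for term in ("rotor", "the"):
    print(term, round(tf_idf(term, tokenized[0], tokenized), 3))
# The informative term "rotor" receives a clearly higher weight than the function word "the".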

Domain relevance, meaning the importance of a term for a subject area, can also be learned by statistical means. A so-called classifier can learn to assign the degree of domain relevance from large, annotated corpora.

The probability of occurrence of a term candidate can be determined by statistically comparing general corpora with domain-specific corpora: a genuine term is more likely to occur in a domain-specific corpus than in a general-language corpus, whereas an everyday word occurs more frequently in the general-language corpus than in the domain-specific one.
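This comparison of relative frequencies can be expressed as a simple ratio (sometimes called a "weirdness" score). The counts below are invented; in practice they would come from a large domain corpus and a general-language reference corpus:

# Invented frequency counts; real values would come from corpus statistics.
domain_counts = {"rotor": 120, "stator": 95, "weather": 3}
domain_total = 50_000             # tokens in the domain-specific corpus
general_counts = {"rotor": 4, "stator": 2, "weather": 800}
general_total = 5_000_000         # tokens in the general-language corpus

def domain_relevance(term):
    """Ratio of relative frequencies; values well above 1 suggest a term candidate."""
    rel_domain = domain_counts.get(term, 0) / domain_total
    rel_general = (general_counts.get(term, 0) + 1) / general_total   # +1 avoids division by zero
    return rel_domain / rel_general

for term in ("rotor", "weather"):
    print(term, round(domain_relevance(term), 1))
# "rotor" scores far above 1, "weather" far below.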

Moreover, work is being done on "relation extraction", "concept mining" and "knowledge and ontology extraction". Here, knowledge-based, symbolic methods produce excellent results and can be extremely useful for terminology work. This was clearly demonstrated in an experiment on the automatic creation of an electromobility thesaurus as part of a project funded by the BMWi [6, 7] (Figure 3).

Figure 3: Automatic detection of hypernym and hyponym relations.
Source: WISSMER, project report (funded by BMWi)

 

Suitability of an approach

Based on these analyses, can we now assume that certain technologies based on symbolic methods have become superfluous? And is it true that new tools based on new approaches are finding their way into technical communication?

The answer is yes and no. Machine translation will definitely have a strong impact on the translation market. This is less evident in other areas. The reason is simple: systems that are based on deep learning have disadvantages as well. Learning requires tons of data and very long learning phases. However, the applications mostly work very well once learning is complete.

The situation at the end of the learning phase is: once learned, never forgotten. We don't know exactly why something works and, above all, there is no longer any direct influence on the results. Of course, the system can continue to learn. But if an MT system has learned to translate and then translates a passage in a text incorrectly, the error cannot simply be corrected by referring to a dictionary or other resources. At first, it is not possible to correct the mistake directly at all; in a rule-based system, this would be possible. However, a combination with rule-based resources could be a possible solution.

If the system continues to learn, this could compromise the stability of the results. To put it more drastically: the MT system will provide different results for the same text in the morning and in the afternoon. All results are correct, but not the same. This is bad for standardization and very bad for the goal of technical communication, which is to obtain standardized and consistent formulations for the same facts.

Here, one is confronted with a problem similar to that of translation memories that have grown unchecked over the years: several source-language variants result in translation variants, which likewise violates the principle of standardization and linguistic consistency.

Holistic thinking

There seem to be areas where rule-based knowledge is necessary and where terminology collections, thesauri, ontologies and linguistic knowledge are important. A tenet of earlier AI was that certain applications are more amenable to symbolic AI. These include sophisticated "intellectual" tasks such as drawing logical conclusions, understanding language, analyzing grammatical structures, processing information with ontologies and concept systems and, last but not least, translating natural languages.

But there were also tasks that were considered more suitable for connectionist models. For these, rule-based approaches are of limited use, for example the recognition of spoken language or image recognition. The human capacity for fault tolerance is important here: humans can recognize spoken language even with a high level of background noise and can interpret and recognize images despite major defects.

Another aspect is that the human mind works "holistically". A statement is immediately interpreted as a whole and understood in one go. Or to overstate: the human mind does not first look at the grammar, then look up words in a mental lexicon, form word groups and then interpret them semantically. The holistic or integrated approach is also a domain of deep learning and can be simulated very well.

Perspectives for Artificial Intelligence

The points mentioned above are important aspects of the current state of AI. The basic tendency is that connectionist AI will penetrate more and more areas of intelligent services. According to Gereon Frahling, the founder of DeepL, "neural networks have developed an incredible understanding of language". This will open up unimagined perspectives for the future. Particularly if machine learning is combined with symbolic methods, the areas of application of AI in technical communication will continue to expand and the results will continue to improve.

There is speculation that the use of AI systems could ultimately lead to the creation of "robot editors" that would one day perform the tasks of technical communicators. However, that is very unlikely. Technical communication will continue to involve intelligent services that cannot be learned by AI systems. These include tasks in the fields of terminology, translation or the formulation of appropriate texts. In these areas, rule-based systems such as language testing software can aid the editors, but not replace them.

 

Links and literature

[1] Closs, Sissi (2017): Paradigmenwechsel für Technische Redakteure? In: Notes on Technical Communication, Volume 22, Pg. 97 ff. Gesellschaft für Technische Kommunikation – tekom Deutschland e. V.: Stuttgart.

[2] Grammarly

[3] Deep Grammar

[4] www.publicationcoach.com/grammarly

[5] DeepL

[6] Kucera, Gerd (2013): Elektromobilität. Die größte Informations- und Wissensdatenbank entsteht. In: Elektronik Praxis.

[7] Jaksch, Manfred; Schmidt, Paul (2013): Linguistische Techniken für eine Wissensplattform. In: DOK, Technologien, Strategien & Services für das digitale Dokument (Technologies, Strategies & Services for the Digital Document). May/June issue.