Intelligently linking information – with the help of knowledge graphs

Knowledge graphs are one of the top trends in applied Artificial Intelligence. But what exactly are they? And what could they be used for in a domain like technical communication?

tcworld magazine | October 2023 Text by Scott Abel, Martin Ley

Inhaltsübersicht

Image: © FabrikaCr/stockphoto.com

Bringing together different information from different information silos is becoming increasingly important. We will illustrate this challenge by putting ourselves in the situation of a fictitious mechanical engineering company, specializing in the development, production, sales, and servicing of hydraulic pumps.

Our company knows quite a lot about a particular hydraulic pump. It knows, for example, which product line a pump belongs to. It also knows in which plant the pump is installed, and where the plant is located. The company even knows when the pump reaches a critical maintenance cycle, what procedures then need to be performed, and which supplies are required during maintenance.

In this article, we will use this scenario to introduce the linking and virtualization of typically heterogeneous information. For this purpose, we will make use of the knowledgegraph approach. We will discuss the concepts and methods of semantic modeling and outline a possible IT architecture. At the end of the article, we will give an outlook and sketch additional scenarios that could be conceivable for our approach.

As a start: Information in rows and columns

Let’s begin with how information about a product is typically represented. Table 1 shows information presented in tabular format. (Where this information comes from and how it found its way into the table will not be discussed here; we will come back to this later.) Let’s just assume someone transferred it to an Excel file.

Row #	Product name	Product line	Plant	Location	Next maintenance interval (MI)	Required topic	Required supplies
1	WT-P-001	Turbo Pelot	PANL-01	Munich	07/2023	Replacing the pump head	Pump head 12 Seal XY
2	WT-P-002	Turbo Pelot	PANL-02	Wudongde	09/2023	Replacing the flask	Flask 12 Seal kit ZZ
3	WT-DS-001	Stream One	PANL-03	Saltstraumen	07/2023	Replacing the pump head	Pump head 23 Seal XY

Table 1: Tabular presentation of product knowledge

For (most) people, the interpretation of Table 1 is relatively simple: The table header indicates which properties (such as "product line" or "plant") a product has. Each row represents a specific hydraulic pump. Finally, the table cells contain the concrete values of those properties for each pump, column by column (e.g., in row #1 the value "Turbo Pelot" for the property "Product line"). For a computer, the processing of such a table is also possible, but what if, for example, the columns are interchanged, the sequential numbers are not uniform, or the labels of the properties and/or property values are different? What if the information is distributed among various tables? How can the corresponding dependencies then be both displayed and processed by a machine?

The graph as a solution approach

It can be noted that the information given in Table 1 is characterized by three different aspects: the row, the column, and the actual value in the cell. These three aspects must be aligned with each other. This is done via identifiers of the corresponding entities; one for the row, one for the header of the respective column, and one for the value in the cell. Transferred to the statement "Hydraulic pump WT-P-001 belongs to product line Turbo Pelot", this means that we have three entities, the subject "Hydraulic pump WT-P-001", the predicate "belongs to product line" and the object "Turbo Pelot". Consequently, our statement can be represented as a triple of the three entities subject-predicate-object. If we consider the first four columns of row 1 from Table 1, we can represent the information given there as a set of triples (see Table 2).

Subject	Predicate	Object
Hydraulic Pump	has Product Name	WT-P-001
WT-P-001	belongs To Product Line	Turbo Pelot
WT-P-001	is Installed In	PANL-01

Table 2: Statements in the form of triples

For reasons of readability, we have represented the subject "Hydraulic Pump WT-P-001" in a somewhat simplified way. We will discuss this further below. For better illustration, it is useful to represent the information presented in Table 2 in the form of a directed graph (see Figure 1). The nodes of the graph represent the subject and the objects, and the edges correspond to the predicates.

Figure 1: Graph-based representation of triples

It becomes particularly interesting when we look at the other columns from Table 1. For example, we know that the plant in which our WT-P-001 hydraulic pump is installed is located in Munich and will reach the next maintenance cycle in July 2023. The task to be performed is the replacement of the pump head. And even the supplies are known. Our statements can therefore be extended as shown in Table 3.

Subject	Predicate	Object
PANL-01	is Located In	Munich
WT-P-001	has Next MI	MI 07/2023
MI 07/2023	requires Topic	Replacing pump head
Replacing pump head	requires Supply	Pump head 12

Table 3: Product knowledge supplemented by statements from a service perspective

The statements from Table 3 can also be represented as a graph, and this in turn can be combined with our first graph from Figure 1, resulting in Figure 2. Due to the so-called network effect, we can now integrate further information from different parts of the company such as production (e.g., product variants), marketing (e.g., special accessories), or service (e.g., condition monitoring), giving us, as we will see below, a powerful tool for drawing inferences.

Figure 2: The network effect: A combined graph

Making graphs machine-readable

What has been described informally in the course of this article so far must be formalized in order to actually be machine-readable and computable. So-called ontologies give us the necessary framework for this. By an ontology, we mean the formal representation of a specific domain. It defines which types of entities can exist in our domain and how these entities are related to each other.

Let us illustrate this with our pump example and the statement "The hydraulic pump WT-P-001 is installed in plant PANL-01". In the ontology, the classes and relations between possible representatives of the classes are defined first. That the predicate "isInstalledIn" exists between a hydraulic pump and a plant is defined by domain restrictions (see Table 4). As we can see, even the ontology is described in the form of subject-predicate-object.

Subject	Predicate	Object
HydraulicPump	isType	Class
Plant	isType	Class
isInstalledIn	isType	Predicate
HydraulicPump	isInstalledIn	Plant
isInstalledIn	hasDomain	HydraulicPump
isInstalledIn	hasRange	Plant

Table 4: Excerpt from the "pump" ontology

Concrete objects such as the hydraulic pump "WT-P-001" or the plant "PANL-01" can be described as instances of these classes. The "isInstalledIn" relation exists between both entities. This could be modeled as an inverse relation, so that "PANL-01 contains the hydraulic pump WT-P-001" is also valid (see Table 5). As a further interesting modeling option, so-called subclasses can be defined, e.g., "hydraulic pump" is a subclass of "pump".

Subject	Predicate	Object
WT-P-001	isType	HydraulicPump
PANL-01	isType	Plant
WT-P-001	isInstalledIn	PANL-01
PANL-01	containsProduct	WT-P-001

Table 5: Concrete objects as instances of classes

Equipped with these considerations, we complete the graphical representation of our pump example from Figure 2 with the class information. The concrete instances are represented by circles, the associated classes by rectangles (see Figure 3).

The ontology is superimposed like a net over our information. On the one hand, the originally implicit available information from our first tables is now explicit via direct statements. On the other hand, we can also make indirect statements based on the ontology, such as "Munich belongs to the DACH sales region". Semantic modeling thus offers a wide range of possibilities for finding out dependencies between different entities.

Figure 3: Statements about objects (incomplete)

Linked information

The discussion so far has not taken into account that our "pump knowledge" is stored (mostly historically!) in many different places in the company: product-related information possibly in a PLM system, technical documentation in a CCMS, and information about supplies in an SPC. Other information is often "buried" in Excel lists or resides on different drives. The challenge for our company, therefore, is to intelligently link the information needed for specific requirements and make it available centrally.

In Figure 4, we sketch a rudimentary IT architecture. There, we stick to our considerations about semantic modeling (ontology) and graph-based representation. To manage our information, we need a graph database that can process RDF data. Because our information is usually already available in "some" place in the company in various information silos, we need a conversion tool that transfers existing information to the graph database for defined use cases. We often resort to mapping, as the information we are dealing with is usually already structured or semi-structured (for unstructured information, classification and conversion can be done automatically, if necessary). For this conversion, it is important to emphasize that the goal is not to duplicate all information. The graph database may also contain what are called "proxy objects" for the actual resources in the original systems. An Inference & Query Engine gives access to the graph database to query the triples stored in RDF. Users access the content via one or more applications/interfaces.

Figure 4: Sketch of an IT architecture for knowledge graph-based applications

Conclusion and outlook

In this article, we have introduced semantic modeling using an example from mechanical engineering. Its basic concepts, such as triple and graph-like representation, make it possible to intelligently link and virtualize both structured and unstructured information and different information silos. It is not always necessary to reinvent the wheel. iiRDS, for example, already provides an existing ontology developed for technical information that can be seen as part of a knowledge graph. In addition, there are established taxonomies, thesauri, or ontologies that can be used as a starting point for semantic modeling. Internal company nomenclatures or metadata taxonomies also provide a good starting point. With the help of suitable software, company-specific ontologies or custom schemes can be developed.

Knowledge graphs incorporate very different types of information from different areas of the company and offer a 360-degree view of the company. Our example is based on how product knowledge can be linked to service knowledge. A conceivable application based on such a knowledge graph would be, for example, a “tech assistant app” or a “spare parts recommender”.

Overall, we find that knowledge graphs can help to continuously improve service. They provide answers to some of the central service strategy challenges in companies, such as increasing the first time fix rate or overall equipment effectiveness or decreasing mean maintenance downtime. In addition, knowledge graphs provide an excellent basis for concrete application scenarios such as diagnostic assistants or predictive maintenance. They can help receive feedback from service technicians and close the development-production-service cycle.

What SMEs wish tech writers knew

Technical documentation for a greener Europe

Will AI kill your CCMS? A tech writer’s crime scene investigation

Creating your brand as a tech writer

What SMEs wish tech writers knew

A practical guide to machine translation quality prediction

Misguided Guidance: When docs and UI don’t align

My experience with the TCTrainNet course

Intelligently linking information – with the help of knowledge graphs

As a start: Information in rows and columns

The graph as a solution approach

Making graphs machine-readable

Linked information

Conclusion and outlook

Further reading

What SMEs wish tech writers knew

Technical documentation for a greener Europe

Will AI kill your CCMS? A tech writer’s crime scene investigation

Creating your brand as a tech writer

What SMEs wish tech writers knew

A practical guide to machine translation quality prediction

Misguided Guidance: When docs and UI don’t align

My experience with the TCTrainNet course

Intelligently linking information – with the help of knowledge graphs

As a start: Information in rows and columns

The graph as a solution approach

Making graphs machine-readable

Linked information

Conclusion and outlook

Further reading

About the authors

tekom member login