Do your user instructions pass the test?

A closer look at how different usability methods reveal whether information truly works in practice

Text by Stephanie Schwenke

Image: © deepblue4you/istockphoto.com

This article summarizes the assessment methods that Use-Lab presented at the IUNTC meeting in February 2026. Stephanie Schwenke presented several methods that Use-Lab frequently applies when assessing information products, illustrating how to make the most of an evaluation and generate meaningful results. Based in Germany, Use-Lab GmbH is a usability engineering consultancy for medical technology.

This article outlines the core methods of usability testing and takes a closer look at two examples: a scenario-based task involving patient information and a cloze test based on the instructions for a kitchen appliance.
  

Evaluation criteria

Information products must not only provide information but also help people find, understand, and confidently use content. Whether a user manual, patient information, or technical instructions actually work therefore only becomes evident when they are put to use. Information products can be assessed in many different ways, depending on the objective of the evaluation.

Use-Lab uses a framework consisting of seven aspects for the assessment:

  1. Motivation: Is the user willing to read the document in the first place?
  2. Findability: Can relevant content be found (quickly)?
  3. Perceptibility: Is the content recognized as important?
  4. Reading comprehension: Can users understand words and sentences (or equivalent components of images)?
  5. Understandability: Can meaning be assigned to the words and sentences?
  6. Applicability: Can what has been read be put into practice?
  7. Comprehensiveness: Is all the necessary information included, from both the manufacturer’s and the user’s perspective?
      

Methodological approaches

One obvious method for usability testing is observing users perform tasks with the product. During these tests, participants are asked to solve a specific task based on instructions, a table, or an illustration. This allows us to verify whether information is discoverable and can be put into practice.

Closely related to this is user training. A person familiarizes themselves with the product and informational materials and then trains someone else. Observers note which content is conveyed and whether safety-related or particularly important points are actually addressed.

Checklists enable a structured assessment even when the product itself is not available. They allow for the systematic comparison of regulatory requirements, internal standards, or linguistic criteria. Readability indices can be used as a supplement, if necessary. They provide a rough impression but are insufficient in technical contexts, where long or specific terms are often unavoidable.
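To illustrate why a readability index gives only a rough impression, here is a minimal sketch that computes the Flesch Reading Ease score for English text. The choice of this particular index, the vowel-group syllable heuristic, and the two sample passages are assumptions made for illustration; the article does not state which indices Use-Lab uses.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease for English text; higher scores mean easier reading."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

# Invented sample passages with equally short sentences.
plain = "Press the red button. Wait for the light. Then lift the lid."
technical = "Deactivate the electromechanical interlock. Recalibrate the potentiometer afterwards."

print(round(flesch_reading_ease(plain), 1))
print(round(flesch_reading_ease(technical), 1))
```

Both passages use four words per sentence, yet the technical one scores far lower purely because of its long terms. The index cannot tell whether those terms are familiar to the intended readers, which is why it can only supplement the other methods.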

Multiple-choice questions are only meaningful if the questions and answer options are constructed very precisely. Asking participants to explain content in their own words reveals whether it has truly been understood. Scenario-based tasks and cloze tests are particularly well-suited for highlighting individual quality aspects of a document.
  

Example 1: Scenario-based task on patient information

Instead of merely testing whether a statement has been read, this task assesses whether information is correctly applied in a specific context. This is particularly valuable for warnings or information on side effects, as risky situations do not need to be directly recreated with the product.

In this example, we examine patient information for a medication used to treat glaucoma and to manage elevated pressure following an intravitreal injection. The document contains information on usage, contraindications, side effects, and ways to mitigate these. For this task, we focus specifically on the last of these topics, mitigating side effects.

The scenario is built around a specific persona: Ronald has received an intravitreal injection, the medication was prescribed to him, and he has read that fatigue, weakness, or dizziness may occur. The issue is relevant to his daily life because he wants to drive to his daughter’s house and look after his granddaughter. This translates the information from the document into a real-life situation.

To ensure the task does not lead to medically problematic conclusions, it is specified that Ronald should adhere to the dosage schedule prescribed by the doctor. This prevents the test subject from interpreting an independent change in medication as a solution. Instead, the questionnaire asks when Ronald should refrain from driving and what he can do to reduce the risk of side effects.

It is expected that two types of answers will be derived from the document: First, Ronald should not drive if he feels weak, tired, or drowsy. Second, he should drink more fluids and eat potassium-rich foods. If test participants instead suggest that Ronald should adjust the dose himself or spread smaller doses throughout the day, this reveals a possible misunderstanding of the text. The method thus not only yields correct or incorrect answers but also indicates potential misinterpretations and phrasing that should be revised.

At the same time, the example allows conclusions to be drawn about the document’s structure. Since the relevant information is spread across two pages, it can be examined whether the page break or the arrangement of the content contributes to important details being overlooked. Scenario-based tasks thus assess language, structure, and application simultaneously, depending on how they are phrased and which document sections are in focus.
  

Example 2: Cloze test with a food processor manual

For this assessment, every sixth or seventh word in a selected section is removed. Participants fill in the gaps based solely on the remaining context. It is not the person who is evaluated, but the document: The easier it is to deduce the missing words, the more robust the word choice and sentence structure are.

In the example, the test is applied to an instruction manual for a food processor. A coherent prose text is deliberately chosen rather than a list or isolated warnings, because only a continuous text provides sufficient context for reliable inferences. After selecting the passage, the words to be omitted are systematically determined. The number of gaps must be chosen so that the task is challenging but not frustrating.
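The systematic removal of words could be sketched as follows. This is a minimal illustration, not Use-Lab's tooling: the function name `make_cloze`, the gap format, and the sample passage are invented for this example, and the passage is not taken from the manual discussed in the article.

```python
import re

def make_cloze(text: str, every_nth: int = 6, offset: int = 0):
    """Replace every n-th word with a numbered gap; return (cloze_text, answer_key)."""
    if every_nth < 2:
        raise ValueError("every_nth must be at least 2")
    tokens = text.split()
    answer_key = {}
    for index in range(offset + every_nth - 1, len(tokens), every_nth):
        # Keep surrounding punctuation so sentence boundaries stay visible.
        match = re.match(r"^(\W*)(\w[\w'-]*)(\W*)$", tokens[index])
        if match is None:
            continue  # not a plain word (e.g. a lone dash); leave it untouched
        lead, word, trail = match.groups()
        gap_no = len(answer_key) + 1
        answer_key[gap_no] = word
        tokens[index] = f"{lead}____({gap_no}){trail}"
    return " ".join(tokens), answer_key

# Invented continuous prose, standing in for a manual section.
sample = (
    "Place the bowl on the motor base and turn it clockwise until it locks. "
    "Insert the blade unit before you add any food. "
    "Close the lid firmly, then select the speed with the dial."
)

cloze_text, answer_key = make_cloze(sample, every_nth=6)
print(cloze_text)
print(answer_key)
```

Changing `every_nth` or `offset` is one way to adjust the number and position of gaps when a first draft turns out to be too easy or too frustrating.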

The evaluation is nuanced. Answers are not mechanically counted as right or wrong. Rather, the assessment focuses on whether an alternative word choice makes sense in terms of content. Especially where participants frequently fill in unexpected terms, it is worth asking: Was the headline unclear, the sentence too convoluted, or the technical term not sufficiently contextualized? In this way, not only individual problems become apparent, but also structural weaknesses in a text.
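A simple tally can help reviewers see where unexpected words cluster without scoring answers as right or wrong. The sketch below is only an illustration: the function name, the response data, and the 50 percent review threshold are invented assumptions, and judging whether an alternative word makes sense remains a task for a human reviewer.

```python
from collections import Counter

def tally_gap_responses(answer_key, responses):
    """Per gap: list the words that differ from the original and their share of all answers."""
    report = {}
    for gap_no, original in answer_key.items():
        # Ignore gaps a participant left blank.
        answers = [r[gap_no].strip().lower() for r in responses if r.get(gap_no, "").strip()]
        counts = Counter(answers)
        unexpected = {word: n for word, n in counts.items() if word != original.lower()}
        share = sum(unexpected.values()) / len(answers) if answers else 0.0
        report[gap_no] = {"original": original, "unexpected": unexpected, "unexpected_share": share}
    return report

# Invented data: two gaps, four participants.
answer_key = {1: "blade", 2: "lid"}
responses = [
    {1: "blade", 2: "lid"},
    {1: "knife", 2: "lid"},
    {1: "cutting", 2: "cover"},
    {1: "knife", 2: "lid"},
]

report = tally_gap_responses(answer_key, responses)

# Hypothetical threshold: gaps where at least half of the answers differ get a closer look.
flagged = [gap for gap, row in report.items() if row["unexpected_share"] >= 0.5]
for gap in flagged:
    print(gap, report[gap]["original"], report[gap]["unexpected"])
```

A flagged gap is a prompt for the questions above, not a verdict: the alternatives may be perfectly sensible, or they may point to an unclear headline or an unexplained term.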

In the example discussed, for instance, the question arises as to whether a function-oriented headline such as “Use and Cleaning of the Cutting Unit” would be more understandable than a brief designation such as “Blade Unit”. The cloze test can thus provide insights for improvements on multiple levels – from word choice to the contextual framing of entire sections.
  

Why multiple methods make sense

The combination of different methods follows the principle of method triangulation. A document is examined from multiple perspectives because each method has different strengths. Scenario-based tasks demonstrate how information is applied in realistic situations. Cloze tests primarily assess linguistic clarity. Tasks involving the product or user training sessions clarify whether information is actually usable in a practical context.

Two guidelines are useful for selecting test items: First, different forms of presentation should be considered, such as warnings, tables, graphics, lists, and step-by-step instructions. Second, a risk-based approach can be taken. In this case, the focus is on content where misinterpretations or misuse would have particularly serious consequences.
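One possible way to combine the two guidelines is sketched below: first cover each form of presentation with its most critical item, then fill the remaining slots by risk. The 1 to 5 severity scale, the candidate sections, and the selection rule itself are assumptions made for illustration and are not described in the presentation.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    section: str
    form: str      # presentation form, e.g. "warning", "table", "steps"
    severity: int  # hypothetical 1-5 rating of how serious a misinterpretation would be

def select_items(candidates, max_items):
    """Pick the highest-severity candidate per form, then fill up by severity."""
    ranked = sorted(candidates, key=lambda c: c.severity, reverse=True)
    chosen, seen_forms = [], set()
    for c in ranked:  # first pass: one item per presentation form
        if c.form not in seen_forms:
            chosen.append(c)
            seen_forms.add(c.form)
    for c in ranked:  # second pass: everything else, most severe first
        if c not in chosen:
            chosen.append(c)
    return chosen[:max_items]

# Invented candidate list for a fictional document.
candidates = [
    Candidate("Dosage table", "table", 5),
    Candidate("Driving warning", "warning", 4),
    Candidate("Storage warning", "warning", 3),
    Candidate("Cleaning steps", "steps", 2),
    Candidate("Contraindication list", "list", 5),
]

selected = select_items(candidates, max_items=4)
for item in selected:
    print(item.severity, item.form, item.section)
```

With four slots, the second warning is dropped in favor of the step-by-step instructions even though it carries a higher rating, because warnings are already covered. A purely risk-based selection would simply sort by severity instead.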
  

Conclusion

Evaluating information products requires a multidimensional perspective. What matters is not only whether the content is factually accurate, but also whether it is engaging, easy to find, understandable, and helps users act with confidence. The methods described here show how these aspects can be systematically assessed. It is precisely when these methods are combined that a reliable picture emerges of where a document is already effective and where revisions are needed.