As you probably know, last month I presented at the Virtual Summit on Intelligent Agents and Chatbots, organized by The Content Wrangler and Content Rules. We talked about how the concept of producing multiple output formats from a single source evolved into the idea of delivering contextually relevant content through various delivery channels, like chatbots or augmented reality apps.
You can watch the recording here.
Due to time constraints, we had to leave some questions unanswered, and I promised to address them in follow-up posts. I’m starting to keep that promise by answering the first batch of questions.
“Are there any standards (like IEEE) for the kinds of metadata provided for information? Something that can be used as a basis for the context and entity questions? I know DITA, but it doesn’t stretch to that.”
There are standard metadata models that you may want to consider. One of the latest developments is the intelligent information Request and Delivery Standard (iiRDS). You can read more about it here. From the iiRDS specification: “For retrieving information units dynamically according to the usage scenario and context, we can no longer deliver document-based documentation; we have to deliver a bulk of single topics. In the old-fashioned manual, readers would know from context to which product, target group, or lifecycle phase a section is related to. If we replace the document manual with single-topic delivery, we need metadata that helps create this context and also lets users/applications find, in a large amount of topics, the right topic for a particular usage scenario or user. Metadata provides context. Therefore, metadata needs to be part of the delivery, along with the content.”
iiRDS is based on RDF Schema (RDFS). What is important is that both RDFS and iiRDS are metadata models rather than formats. In other words, they define that your metadata should carry information about the product, its components, information types, and so on, and they specify the relationships between metadata properties, but they don’t prescribe a specific syntax, like XML or anything else.
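To illustrate the difference between a model and a format, here is a minimal Python sketch. The class TopicMetadata and its three properties are made up for this example; they are not the actual iiRDS vocabulary. The point is only that the same model can be written out in more than one syntax.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class TopicMetadata:
    """Hypothetical metadata model: WHAT we record about a topic, not HOW it is written."""
    product: str
    component: str
    information_type: str


def to_json(meta: TopicMetadata) -> str:
    # One possible syntax for the model: a JSON object.
    return json.dumps(asdict(meta), sort_keys=True)


def to_xml(meta: TopicMetadata) -> str:
    # Another possible syntax for the same model: one XML element per property.
    root = ET.Element("metadata")
    for name, value in asdict(meta).items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
meta = TopicMetadata(product="Pump X100", component="bearing", information_type="task")
print(to_json(meta))
print(to_xml(meta))
```

Both outputs carry exactly the same properties and values. Which syntax you pick is a separate decision from what the model contains.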
This means that there is nothing wrong with using DITA as a format (syntax) to represent the RDFS or iiRDS metadata model. For example, you could use DITA’s subjectScheme maps to implement your metadata model. But of course, the choice of format depends entirely on your metadata model and the goals you want to achieve.
Having said that, I’d like to emphasize that the basis for questions about the context and entities is not the metadata itself. Metadata works on the content side. These questions arise on the chatbot side. The basis for them is a classification of intents and a definition of entities.
“How to start the project to add the metadata to traditional way created information which is needed for maintenance instructions?”
Generally, your metadata needs to reflect your users’ intents. This is how the chatbot will be able to find the information that addresses the user’s question. You can start by classifying the intents that users might have. For example, for maintenance instructions, your users may want to know how often scheduled maintenance needs to be done for different components, how to lubricate, replace, or inspect different components, and so on. Of course, the classification can be more complicated, with intents nested within each other or related to each other in all kinds of ways. For each intent, you also define the parameters that you need to know to find the most relevant information. For example, if the intent is “lubrication”, you need to know at least the name of the component to be lubricated.
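Here is a rough Python sketch of what such a classification might look like. All names in it (Intent, required_params, find_intent, missing_params, and the intent names themselves) are my own illustrative assumptions, not part of any particular chatbot platform. It shows nested intents and a check for which parameters the chatbot still has to ask about.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Intent:
    """Hypothetical intent: a name, the parameters it needs, and nested sub-intents."""
    name: str
    required_params: Tuple[str, ...] = ()
    children: List["Intent"] = field(default_factory=list)


# Illustrative classification for maintenance instructions.
maintenance = Intent("maintenance", children=[
    Intent("maintenance_schedule", ("component",)),
    Intent("lubrication", ("component",)),
    Intent("replacement", ("component",)),
    Intent("inspection", ("component",)),
])


def find_intent(root: Intent, name: str) -> Optional[Intent]:
    """Depth-first search for an intent by name in the nested classification."""
    if root.name == name:
        return root
    for child in root.children:
        found = find_intent(child, name)
        if found is not None:
            return found
    return None


def missing_params(intent: Intent, known: Dict[str, str]) -> List[str]:
    """Parameters the chatbot still needs before it can look up content."""
    return [p for p in intent.required_params if p not in known]


lubrication = find_intent(maintenance, "lubrication")
print(missing_params(lubrication, {}))
print(missing_params(lubrication, {"component": "bearing"}))
```

When the list of missing parameters is not empty, the chatbot knows it has to ask a follow-up question (for example, which component the user means) before searching the content.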
Once you know what users may ask and what you have in your content, you need your content to be tagged with metadata that specifies whether this piece of information is about scheduled or unscheduled maintenance, lubrication, inspection, or replacing the component, what component it describes, and so on.
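As a simple illustration, tagged content and a lookup against it could be sketched in Python like this. The topic titles, the property names (intent, component, maintenance), and the function find_topics are invented for the example; a real implementation would follow your own metadata model.

```python
from typing import Dict, List

# Hypothetical topics, each tagged with metadata that mirrors the intents and parameters.
TOPICS: List[Dict[str, str]] = [
    {"title": "Lubricating the bearing", "intent": "lubrication",
     "component": "bearing", "maintenance": "scheduled"},
    {"title": "Replacing the bearing", "intent": "replacement",
     "component": "bearing", "maintenance": "unscheduled"},
    {"title": "Inspecting the drive belt", "intent": "inspection",
     "component": "drive belt", "maintenance": "scheduled"},
]


def find_topics(topics: List[Dict[str, str]], **criteria: str) -> List[Dict[str, str]]:
    """Return the topics whose metadata matches every given property exactly."""
    return [
        topic for topic in topics
        if all(topic.get(key) == value for key, value in criteria.items())
    ]


# The chatbot has recognized the intent and collected the component name.
for topic in find_topics(TOPICS, intent="lubrication", component="bearing"):
    print(topic["title"])
```

Because the metadata uses the same vocabulary as the intents and their parameters, the lookup is a plain property match. Real content sets usually need something more forgiving, such as synonyms or a component hierarchy, which this sketch leaves out.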
The process of assigning metadata can be completely manual or automated with some human supervision. In our products, we use both approaches. That is, you can always add metadata manually, but you can also ask the metadata engine to analyze the text automatically (we use various natural language processing algorithms) and add metadata for you, which you can then edit.
“Many companies have lots of information about maintenance instructions etc. they are not structured and have overlapping information. It must be a huge project to get started to analyze the present instructions. How should you get started the project?”
I’ve partially addressed this question in the previous answer, so here I’d like to focus on the part about unstructured and overlapping information. There are tools that can at least partially automate the conversion to a structured format (for example, you may want to take a look at one of our tools called ConverToo) and help you identify potentially reusable content (check out terminology management tools, like Acrolinx or Congree), but there is no magic button. To get a substantial outcome, some effort is required.
The real question is how you can manage this effort in a way that gets you the desired result while minimizing risks and maximizing the return. I would recommend a staged approach in which each stage gives you a workable product, so that you can assess the results and adjust your strategy (and the financial investment) before you move on to the next stage.
For example, if you want to have a customer support chatbot:
“How are you approaching the task of designing the conversations – interviews with users, or internal brainstorming?”
It’s both. As I said in the answers to the previous questions, you have to start by classifying the intents that the user might have. You do it based on the actual requests you’ve been getting from your customers through all possible channels, including your support tickets, user forums, phone calls, emails, and so on. It’s very important to keep doing this work all the time, even after your chatbot is launched, and to refine the logic of the conversations if what you initially defined doesn’t work well.
In the next post, I’ll keep answering your questions. They will be about implementation, design, and tools.