Last month I presented at the Content Management Strategies/DITA North America conference in Chicago. The presentation was about the possibility of using DITA to automatically generate requirements management documentation. Since we got quite a lot of interest at the conference, I thought it would make sense to write an essay about what we’ve done.
As a company that develops software (we develop a DITA content management system called DITAToo) and provides implementation services (such as legacy content conversion, publishing stylesheet development, training, etc.), we have to write a lot of requirements documents both for ourselves and for our customers. The documents we write include software requirements specifications (SRS) for developers, sales proposals for customers, test cases for our QA team, cost estimates and project plans for management, and so on.
As you can imagine, there is a lot of reuse. A typical process that we used up until now looked like this:
Now imagine the effort required if we decide to take a requirement out of the scope, if a new requirement is added, or if the scope of an existing requirement is changed. Worse yet, in big projects, we usually have various dependencies between requirements. For example, Requirement B can be implemented only after Requirement A is delivered. Suppose that you decide to take Requirement A out of the scope. Now you have to remove Requirement B from the scope too, redo the cost calculation, and update the SRS, test cases, proposal, and possibly other related documents.
A few months ago, Michael, our Head of Implementation Services, and I were working on quite a big project. It was late evening, and we were estimating the costs. We did a lot of copy-paste work from Word to Excel, and when the estimate seemed to be ready, Michael said: “Hold on a second, something is wrong with this figure. The total cost can’t be what we’ve got.” We began to double-check and found that one of the requirements simply hadn’t been copied from the SRS. Worse yet, it wasn’t included in the test cases document either.
So not only could we have ended up with the wrong estimate; the missing requirement would also never have been tested, because from the perspective of our QA guys it didn’t exist.
That was the point when we realized that enough was enough and that we had to find a fundamental solution.
First of all, we decided that we wanted to use the great potential of DITA as a structured format. Once you have structured content, you can manipulate virtually any piece of content and, most importantly, automate this process.
We’ve realized that we need three types of DITA topics:
Simply put, each DITA topic represents a requirement. Using the @props attribute, you can specify the estimate in hours and who is going to work on this requirement. The @id attribute allows you to assign an ID to the requirement for reference purposes. With the @product attribute, you can define in which project phase the requirement should be implemented.
The screenshot below shows a requirement that has several sub-requirements. Each sub-requirement is estimated separately and has its own ID. You may notice that there are three paragraphs that are marked up as separate requirements. It’s done mainly for QA purposes. We want each of these statements to be tested. So while no separate estimate is required here, in the test cases document, each of these requirements should become a separate test.
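As a rough illustration of how such metadata can be read back out of a topic, here is a minimal Python sketch. The topic is a simplified, DITA-like fragment, and the way the estimate and the assignee are packed into @props (`hours(8) role(developer)`) is purely an assumption made for this example, as are the IDs and the function names; the actual encoding used in our topics may differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical requirement topic. The props encoding "hours(8) role(developer)"
# and all IDs are assumptions made for illustration only.
TOPIC = """
<topic id="REQ-010">
  <title>Import legacy content</title>
  <body>
    <section id="REQ-010-1" product="1" props="hours(8) role(developer)">
      <title>Convert Word files</title>
      <p id="REQ-010-1a">Headings are mapped to topic titles.</p>
      <p id="REQ-010-1b">Tables are preserved.</p>
    </section>
    <section id="REQ-010-2" product="2" props="hours(4) role(qa)">
      <title>Validate converted topics</title>
      <p id="REQ-010-2a">Invalid topics are reported.</p>
    </section>
  </body>
</topic>
"""

PROPS_PATTERN = re.compile(r"(\w+)\(([^)]*)\)")


def parse_props(value):
    """Turn 'hours(8) role(developer)' into {'hours': '8', 'role': 'developer'}."""
    return dict(PROPS_PATTERN.findall(value or ""))


def collect_requirements(xml_text):
    """Return one record per element that carries an @id, in document order."""
    root = ET.fromstring(xml_text)
    records = []
    for element in root.iter():
        req_id = element.get("id")
        if req_id is None:
            continue
        props = parse_props(element.get("props"))
        records.append({
            "id": req_id,
            "tag": element.tag,
            "phase": element.get("product"),
            "hours": float(props["hours"]) if "hours" in props else None,
            "role": props.get("role"),
        })
    return records


requirements = collect_requirements(TOPIC)

# Sub-requirements that carry their own estimate.
estimated = [r for r in requirements if r["hours"] is not None]
total_hours = sum(r["hours"] for r in estimated)

# Paragraphs marked up as requirements: no estimate, but each becomes a test.
testable = [r["id"] for r in requirements if r["tag"] == "p"]

print(total_hours)
print(testable)
```

In this sketch, the two sections feed the estimate, while the three paragraphs only feed the test cases document.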
Because we wanted the cost estimate to be calculated automatically, we had to create a separate topic with all our rates defined. The rates topic is a regular topic that includes a very simple table. You can see an example of such a topic in the screenshot below (the numbers are fictitious).
A project plan describes the phases (stages, bundles, or whatever you call them) into which the project is split. Remember how the @product attribute is used in the example of a requirement topic above? The number to which @product is set corresponds to the number of the project phase defined in the project plan table.
Once the requirements are written, they have to be assembled into various documents, including the SRS, proposal, detailed cost estimate, test cases, and so on. Because each type of document has its own structure, we came up with the concept of “document genres”. Simply put, a document genre is to a document what a DITA information type is to a topic. In the same way as the information type determines the structure of the topic, the document genre defines the structure of the entire document.
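To make the calculation concrete, the following Python sketch shows one way the rates and the phase numbers could be combined into a cost per phase. The roles, rates, requirement records, and the function name are all hypothetical; in practice they would be extracted from the rates topic and from the requirement topics rather than hard-coded.

```python
from collections import defaultdict

# Hypothetical hourly rates, standing in for the table in the rates topic.
RATES = {"developer": 100.0, "qa": 60.0}

# Hypothetical requirement records, as they might be extracted from topics.
# "phase" mirrors the value of the @product attribute.
REQUIREMENTS = [
    {"id": "REQ-010-1", "phase": "1", "role": "developer", "hours": 8},
    {"id": "REQ-010-2", "phase": "1", "role": "qa", "hours": 4},
    {"id": "REQ-020-1", "phase": "2", "role": "developer", "hours": 5},
]


def cost_by_phase(requirements, rates):
    """Sum hours * rate for every requirement, grouped by project phase."""
    totals = defaultdict(float)
    for req in requirements:
        totals[req["phase"]] += req["hours"] * rates[req["role"]]
    return dict(totals)


phase_costs = cost_by_phase(REQUIREMENTS, RATES)
total_cost = sum(phase_costs.values())

for phase in sorted(phase_costs):
    print(f"Phase {phase}: {phase_costs[phase]:.2f}")
print(f"Total: {total_cost:.2f}")
```

Moving a requirement to another phase then only means changing its phase number; the per-phase totals follow from the next run.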
We’ve designed an XML-based format that allowed us to specify the structure of different document genres. For example, we’ve defined that a proposal will have introduction, supported scenarios, functional requirements, software requirements, cost estimate, delivery schedule, and payment terms.
Obviously, although all proposals have the same components and an identical structure, the contents of each component will be different for each proposal. While document genres are project-independent, the contents of the components within a document genre are project-specific.
To define the project-specific contents of each component, we’ve introduced the term “subject”. A subject might be the same as a project or, in bigger projects, it might represent a certain subject area within the same project.
The figures below illustrate how document genres relate to subjects. These are sample structures that show the components included in two genres, proposal and SRS:
This diagram shows how the contents of these components are defined for a particular subject:
These are the same components with the contents defined for a different subject:
A particular document (for example, a particular proposal) is in the intersection of the document genre and subject.
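The relationship between genres and subjects can be sketched as a simple lookup. In the Python example below, the proposal components are the ones listed earlier; the SRS components, the subject name, the topic IDs, and the `assemble` function are hypothetical and serve only to illustrate the idea, not our actual XML-based format.

```python
# Genre definitions: an ordered list of components per document genre.
# The proposal components come from the text; the SRS ones are hypothetical.
GENRES = {
    "proposal": [
        "introduction",
        "supported scenarios",
        "functional requirements",
        "software requirements",
        "cost estimate",
        "delivery schedule",
        "payment terms",
    ],
    "srs": ["introduction", "functional requirements", "software requirements"],
}

# Subject definitions: which topics fill each component (hypothetical IDs).
SUBJECTS = {
    "project-a": {
        "introduction": ["intro-a"],
        "supported scenarios": ["scenario-a1"],
        "functional requirements": ["REQ-010", "REQ-020"],
        "software requirements": ["REQ-030"],
    },
}


def assemble(genre, subject):
    """Return the document outline at the intersection of genre and subject.

    Components without topics for this subject stay in the outline with an
    empty list, so gaps (or generated tables) remain visible.
    """
    contents = SUBJECTS[subject]
    return [(component, contents.get(component, [])) for component in GENRES[genre]]


for component, topics in assemble("srs", "project-a"):
    print(component, topics)
```

The same subject can be passed to any genre, which is what lets one set of requirement topics feed both the proposal and the SRS.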
After we defined topics, document genres, and subjects, we did some coding (mostly XSL) and ended up with a tool that takes individual topics (requirements), assembles them into documents we need based on the structure defined for the appropriate document genre and subject, and automatically generates tables with cost calculation and project phases.
This is, for example, how the cost estimate looks in the PDF of a proposal (the numbers are fictitious):
And this is how the same cost estimate is represented in a cost estimate document intended for our internal use (the numbers are fictitious). As you can see, this one includes much more detail, all of it retrieved from the topics’ metadata:
This is how a table that outlines the project phases, the scope of each phase, and the cost of each phase is generated (the numbers are fictitious):
As you can see, all this information is a result of automatic aggregation. None of these tables were manually created. If anything in the project changes (a requirement is removed from the scope, a new requirement is added, or the scope of a particular requirement is changed), we just need to click a button, and the whole set of requirements documents will be regenerated.
Better yet, we can now get an automatically generated report about possible conflicts that might arise when requirements are changed. For example, if a requirement is taken out of the scope, the report will give us a list of all dependent requirements so we can make an informed decision.
We’ve been using this solution for only a few months, but we are already getting substantial benefits:
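The core of such a report is a walk over the dependency graph. The Python sketch below assumes that dependencies are available as a simple mapping from each requirement to its prerequisites; how they are actually recorded in the topics is not covered here, and the IDs and the function name are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: requirement -> requirements it depends on.
DEPENDS_ON = {
    "REQ-A": [],
    "REQ-B": ["REQ-A"],
    "REQ-C": ["REQ-B"],
    "REQ-D": [],
}


def affected_by_removal(removed, depends_on):
    """List every requirement that directly or indirectly depends on `removed`."""
    # Invert the map: prerequisite -> requirements that depend on it.
    dependents = {}
    for requirement, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            dependents.setdefault(prerequisite, []).append(requirement)

    # Breadth-first walk, so direct dependents are reported first.
    affected = []
    seen = {removed}
    queue = deque([removed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected


print(affected_by_removal("REQ-A", DEPENDS_ON))
```

With the sample data, taking REQ-A out of the scope flags REQ-B and, through it, REQ-C, while REQ-D is unaffected.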