Media-neutral data management and a valid data model are on the agenda in many technical writing departments right now. Here is what really matters in practice, beyond marketing claims and tool promises.
Media-neutral data management sounds like a technical detail. Like something you leave to IT. Like one of those questions you clarify eventually, once the more urgent topic is dealt with.
That is a mistake that gets expensive later.
Over the past years I have seen enough technical writing departments and documentation units from the inside to formulate a thesis I would rather not water down here: the decision about how you hold your content is the decision about whether your documentation project scales in five years or collapses in on itself. Everything else — tool selection, publication channels, translation budget — follows from this one decision.
The pattern I keep running into
A mid-sized mechanical engineering company produces twenty product variants. For each variant there is an operating manual. The manual exists in PDF, in Word, and increasingly in HTML for the web documentation. Some customers additionally require an XML package compliant with iiRDS.
Each variant is maintained separately. In Word. With copy-paste from the master document. Every product change triggers twenty editorial processes. The writing department is permanently chasing the product.
The company believes it has an editorial problem. And it looks for more staff, faster writers, better Word templates.
The problem lies in the data management, not in the writing.
In a situation like that I once counted: of the eight hours a writer worked each day, she spent almost four maintaining identical or near-identical content in different documents. Not because she was inefficient — but because the structure left her no other choice. That is the normal state in companies that have never fundamentally questioned their data management.
What media-neutral data management actually means
Media-neutral data management means: a piece of content is captured exactly once, stored in a structured way, and from there transferred automatically into every required target medium. PDF, HTML, mobile, print, XML, chatbot input — whatever comes along tomorrow.
The content does not know its output format. It knows its meaning.
That has two immediate consequences.
First: a product change leads to exactly one editorial change. Not twenty. The system handles publication into all channels.
Second: new channels do not cost a new production run. When a customer demands iiRDS-compliant data tomorrow, or an app needs a mobile output format, you add an output route — not twenty manuals.
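The "capture once, publish everywhere" idea can be sketched in a few lines. This is a minimal illustration, not how any particular CCMS works: the `ContentBlock` class, the `kind` values, and the two renderers are assumptions made up for the example.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ContentBlock:
    """One piece of content, captured once: it carries meaning, not layout."""
    block_id: str
    kind: str   # hypothetical content type, e.g. "warning" or "step"
    text: str


def to_html(block: ContentBlock) -> str:
    # The content type drives the markup; the block itself contains no HTML.
    return f'<p class="{escape(block.kind)}">{escape(block.text)}</p>'


def to_plain_text(block: ContentBlock) -> str:
    # Plain-text channels get a textual marker for warnings instead of styling.
    prefix = "WARNING: " if block.kind == "warning" else ""
    return prefix + block.text


# Each output route is one function; a new channel means one new entry here.
OUTPUT_ROUTES = {"html": to_html, "text": to_plain_text}


def publish(block: ContentBlock) -> dict[str, str]:
    """Render one block into every registered output route."""
    return {name: render(block) for name, render in OUTPUT_ROUTES.items()}


warning = ContentBlock("w-001", "warning", "Disconnect power before opening the housing.")
outputs = publish(warning)
```

Changing the text of `warning` is one editorial change, and adding a channel is one more entry in `OUTPUT_ROUTES`. Neither touches the other manuals.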
Sounds like a given. In practice it is the exception.
Why is that? Because media-neutral data management starts with an uncomfortable step. Before any tool is in play, it demands a decision: how do I describe my content? Which elements do all my documents have in common? That is more uncomfortable than buying a system, uploading existing Word documents into it, and hoping it somehow works. It doesn’t.
Why "media-neutral" does not hold without a valid data model
It is not enough to dump content into a CCMS and pat yourself on the back. Anyone who wants to hold content media-neutrally needs a data model that describes the content. And precisely enough that a system knows what it may do with which content block.
Three layers have to be modelled for this.
Functional metadata. What kind of content is this? A warning? An action step? A specification? A spare-parts reference? Without this distinction the system cannot make any sensible statement about output formats. A safety warning has to be treated differently from an advertising text — legally and typographically. Anyone who does not anchor this in the model is building on sand.
Technical metadata. Who created the block, when, in which version, with which release? Where does it live? Who may change it? Without this trail your documentation is not audit-proof. And documentation that is not audit-proof is not robust technical documentation in the sense of the relevant directives. Now for the unpleasant part: in an incident that ends up in court, the question will be which version the safety instruction had at the time of delivery. If you cannot prove that without gaps, you have a problem that reaches well beyond the documentation department.
Editorial metadata. Which product does the block belong to? In what language does it exist? To which chapter, which module, which target audience? Without this information, "reusable" is a claim that no publication run will honour. Reuse only works when the system knows what to deliver where. Otherwise you have a well-meaning database from which people still copy and paste by hand.
Anyone who leaves out one of these three layers does not have a valid model. A file store with keywords is something other than a data model.
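As a rough sketch, the three layers could be expressed as one record per content block. All class names, field names, and sample values below are assumptions chosen for illustration. A real model would follow your own content types and, where applicable, standards such as DITA or iiRDS.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FunctionalMeta:
    content_type: str      # e.g. "warning", "action_step", "specification"


@dataclass(frozen=True)
class TechnicalMeta:
    author: str
    created: date
    version: str
    release_status: str    # hypothetical values: "draft" or "released"


@dataclass(frozen=True)
class EditorialMeta:
    product: str
    language: str
    audience: str


@dataclass(frozen=True)
class Block:
    block_id: str
    text: str
    functional: FunctionalMeta
    technical: TechnicalMeta
    editorial: EditorialMeta


def select_for_publication(blocks, product, language):
    """Return only released blocks that match the given product and language."""
    return [
        b for b in blocks
        if b.editorial.product == product
        and b.editorial.language == language
        and b.technical.release_status == "released"
    ]


# Made-up sample data: one released and one draft warning for the same product.
blocks = [
    Block("w-001", "Disconnect power before opening the housing.",
          FunctionalMeta("warning"),
          TechnicalMeta("j.doe", date(2024, 1, 15), "1.2", "released"),
          EditorialMeta("pump-a", "en", "operator")),
    Block("w-002", "Wear protective gloves.",
          FunctionalMeta("warning"),
          TechnicalMeta("j.doe", date(2024, 2, 1), "0.1", "draft"),
          EditorialMeta("pump-a", "en", "operator")),
]
selected = select_for_publication(blocks, "pump-a", "en")
```

The selection only works because all three layers are present. Drop the editorial or technical fields and the system can no longer decide what to deliver where.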
The most common objection — and why it doesn’t hold
"That’s too much effort. We’re only a mid-sized company."
The objection is understandable and almost always wrong. The effort does not lie in building the model. It lies in the non-decision. Anyone who maintains twenty operating manuals in parallel for five years has already paid for the data model — except that they get nothing for it but maintenance.
The honest comparison is: a one-off model build versus ongoing manual maintenance in redundancy. The comparison comes out clearly, but only when you calculate it honestly.
There is a second objection I hear regularly: "We’ve always done it this way, and it has worked." That’s true. It worked as long as the product range was manageable, the translation requirements stayed limited, and no customer demanded iiRDS packages. As soon as one of these three variables changes (and in growing companies one always does), it no longer works. Then you stand in front of a mountain of legacy documents whose migration costs months, while the product business does not wait.
The question that has to come first
Anyone starting a documentation project, or replacing an existing system, should not put a tool question at the beginning. The tool question is the wrong question, and it comes too early.
The first question is: what does our data model look like? What content types do we have? What metadata do we need so the model holds — today and in five years?
Only once that answer is in place is the tool selection a technical exercise. Before that, it is an expensive bet.
I am reluctant to put it this bluntly, but most of the failed CCMS implementations I have seen up close failed because of exactly this order. The tool was chosen before the model existed. Afterwards the team tried to adapt the model to the tool. The result is predictable: the system maps the old processes digitally without improving them. You have a new interface for the old problem.
What this means in practice
Building a valid data model need not be months of fundamental work. I recommend a pragmatic entry point: take five of your most common document types and list which content blocks within them are regularly identical or very similar. Those are your candidates for reuse. Then list which attributes you need to identify each block unambiguously: product affiliation, language version, release status, scope of validity. That is your first data model.
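If the document texts are already available as lists of paragraphs, a first inventory of reuse candidates can be approximated with a short script. The sketch below only finds blocks that are identical after whitespace and case normalisation. Detecting "very similar" blocks would need a fuzzy comparison on top. The function names and the sample manuals are made up for illustration.

```python
import hashlib
import re
from collections import defaultdict


def normalise(text: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not hide duplicates."""
    return re.sub(r"\s+", " ", text).strip().lower()


def reuse_candidates(documents: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each block that occurs in more than one document to the documents containing it."""
    seen = defaultdict(set)   # hash of normalised text -> document names
    first_text = {}           # hash -> first spelling encountered
    for doc_name, blocks in documents.items():
        for block in blocks:
            key = hashlib.sha256(normalise(block).encode("utf-8")).hexdigest()
            seen[key].add(doc_name)
            first_text.setdefault(key, block)
    return {first_text[k]: sorted(docs) for k, docs in seen.items() if len(docs) > 1}


# Hypothetical sample: two manuals sharing one safety warning (with stray whitespace).
documents = {
    "manual_a": ["Disconnect power before opening the housing.", "Tighten the four M6 screws."],
    "manual_b": ["Disconnect  power before opening the housing.", "Fill the tank with coolant."],
}
candidates = reuse_candidates(documents)
```

Here `candidates` contains the shared warning together with both manual names. The two product-specific steps do not appear, because each occurs in only one document.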
It is not perfect. It will evolve. But it is a foundation you can build on. And it shows you which tool requirements you actually have, before you nod along in a vendor demo and sign a contract.
Plan big, start small. Nowhere does that apply more than here.
Media neutrality and translations
One aspect that comes up too rarely in this discussion: the leverage in translation. If your documentation has to be rendered into several languages — and in mechanical engineering, medical technology, and electrical engineering that is almost always the case — the damage from redundant data management multiplies immediately.
With three target languages, twenty variants mean sixty translation projects. When a safety instruction changes, you pay for the translation of that instruction up to sixty times. With media-neutral data management and a working translation-memory system you pay for it once per target language, and on the next run the system recognises that the source text has not changed and supplies the existing translation.
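The arithmetic and the translation-memory effect can be made concrete with a toy model. This is a simplified sketch under stated assumptions: real translation-memory systems work with segments and fuzzy matches, while the hypothetical `TranslationMemory` class below only reuses exact matches, and `fake_translator` stands in for a paid translation job.

```python
import hashlib

variants = 20
target_languages = ["de", "fr", "es"]   # hypothetical language set
translation_projects = variants * len(target_languages)   # 20 * 3 = 60


class TranslationMemory:
    """Toy translation memory: exact-match reuse keyed by source text and language."""

    def __init__(self):
        self._store = {}
        self.paid_translations = 0

    @staticmethod
    def _key(source_text: str, language: str):
        digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
        return (digest, language)

    def translate(self, source_text: str, language: str, translator) -> str:
        key = self._key(source_text, language)
        if key not in self._store:
            # Unknown source text for this language: a paid translation is needed.
            self._store[key] = translator(source_text, language)
            self.paid_translations += 1
        return self._store[key]


def fake_translator(text: str, language: str) -> str:
    # Stand-in for a real translation job.
    return f"[{language}] {text}"


tm = TranslationMemory()
instruction = "Disconnect power before opening the housing."
# The same instruction is requested for every variant in every language.
for _ in range(variants):
    for lang in target_languages:
        tm.translate(instruction, lang, fake_translator)
```

In this sketch the instruction is requested sixty times but paid for only three times, once per target language. Every further request is served from the memory.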
That is immediately measurable savings potential. And it is an argument that management understands too.
What a migration means — and when it is worthwhile
At some point the question comes up: what do we do with the existing documents? You may have a hundred, maybe five hundred manuals in the old format. Full migration sounds like a clearly defined project. In practice it is one of the hardest decisions in the entire digitalisation undertaking.
My recommendation: don’t migrate for migration’s sake. For each item of legacy content, ask: is this content current, validated, and still actively used? If yes, migration is worthwhile. If not, leave it out for now. A discontinued product line does not need migrated documents.
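The three questions translate directly into a triage rule. The following sketch assumes you have an inventory in which each legacy document is flagged by hand. The `LegacyDocument` fields and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyDocument:
    name: str
    is_current: bool
    is_validated: bool
    is_in_use: bool


def worth_migrating(doc: LegacyDocument) -> bool:
    """Migrate only content that is current, validated, and still actively used."""
    return doc.is_current and doc.is_validated and doc.is_in_use


# Made-up inventory: one active manual, one for a discontinued product line.
inventory = [
    LegacyDocument("pump-a-operating-manual", True, True, True),
    LegacyDocument("pump-x-discontinued", False, True, False),
]
to_migrate = [doc.name for doc in inventory if worth_migrating(doc)]
```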
Where translations are involved, the migration decision is usually clear-cut: the cost of re-translation, which arises when you re-capture content instead of migrating it, is greater than the migration effort, provided the source texts are stable. Use existing translations. The translation-memory assets held by your translation agency or in your TMS are capital. Anyone who throws that away and starts over pays twice.
And if you migrate: be honest. If a piece of information was already wrong in Word, it is no more correct in the CCMS. Take the time to check and clean up content during migration. That sounds like extra effort. It is less effort than the alternative — maintaining bad content in a system that was actually built for quality.
What you can do from here
If you have read this article this far, you are probably in a situation where either a digitalisation project is coming up or you are realising that your existing data management is reaching its limits.
The concrete first step is not to call a CCMS vendor. It is to describe your own documentation landscape: how many document types do you have? Which content appears regularly in several documents? Which output formats are demanded today, and which could be added in three years?
This description is the basis for every valid data model. You can work it out internally, without external expertise. And it is the first thing any serious consultant or vendor will ask of you — or should ask of you. Anyone who starts straight away with a tool demo, without having asked these questions, has interests other than yours.
This article is part of a series on data modelling in technical communication. Subsequent parts describe modelling decisions in detail, migration patterns from Word and FrameMaker stocks, and the role of standards such as DITA and iiRDS.
Further standards and industry information are available from tekom — the German professional association for technical communication.
You can find more on concrete real-world cases in our article series on Artificial Intelligence and Technical Documentation.