Can lightweight markup languages be used for documentation?

Over the last few weeks, I’ve been thinking about and using markup languages quite extensively. Not the markup languages that you normally associate with documentation — like HTML, XML, or even LaTeX. I’ve actually been doing some work with lightweight markup languages, specifically one called Markdown. My focus has been using Markdown to write articles and blog posts (including this one); you can read more about my adventures with Markdown here.

Why go lightweight?

One aspect of lightweight markup languages that I find useful is their simplicity, for the most part at least. You do all of your writing in a text editor. The markup that these languages use is what makes them lightweight. The syntax is similar to wiki syntax: keyboard characters are used to define formatting. You can find some examples here.

On top of that, the processors for various lightweight markup languages can convert documents to higher-level forms of markup like DocBook XML, HTML, and LaTeX. You can transform DocBook and LaTeX output to Postscript, PDF, certain online help formats, or even UNIX man pages. I haven’t found a lightweight markup language that supports DITA output, though.
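
To give a rough idea of what such a processor does, here is a toy sketch in Python. It is not Markdown, AsciiDoc, or any real tool: the function names (`inline`, `toy_markup_to_html`) and the tiny syntax subset (`#` headings, `*emphasis*`, backtick code) are assumptions made up for illustration. Real processors handle far more syntax and many more edge cases.

```python
import html
import re


def inline(text: str) -> str:
    """Escape HTML, then turn `code` and *emphasis* into tags."""
    text = html.escape(text, quote=False)
    text = re.sub(r"`([^`]+)`", r"<code>\1</code>", text)
    text = re.sub(r"\*([^*]+)\*", r"<em>\1</em>", text)
    return text


def toy_markup_to_html(text: str) -> str:
    """Convert a tiny, made-up markup subset to HTML (illustration only)."""
    out = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        heading = re.match(r"(#{1,6})\s+(.*)", block)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        else:
            # Anything else is a paragraph; collapse internal line breaks.
            out.append(f"<p>{inline(' '.join(block.split()))}</p>")
    return "\n".join(out)


SAMPLE = """# Why go lightweight?

Plain text with *emphasis* and `code`."""

print(toy_markup_to_html(SAMPLE))
```

The point is only that the source stays readable as plain text, while the heavier markup is generated for you.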

While languages like Markdown are fine for the simple uses I’ve put them to, can they be used for documentation? It really depends on the markup language, what documentation you’re developing, and how you want to present your documents.

Looking at AsciiDoc

Of all the lightweight markup languages that I’ve worked with — and there are a number that I haven’t touched — the one that’s best suited to documentation tasks is AsciiDoc. Two others, reStructuredText and Txt2Tags, also look suitable. But I haven’t worked with them enough to form any kind of opinion of their capabilities.

AsciiDoc is two things. First, it’s a human-readable formatting scheme. AsciiDoc markup is very similar to that used with a wiki: keyboard characters are used to specify formatting. Second, it’s a Python script and a set of configuration files for transforming a marked-up document to a number of different output formats. These formats include HTML and XHTML, UNIX man pages, DocBook, and LaTeX.

While AsciiDoc is really meant for shorter documents, some people (including its author) have written and output longer documents, too. The output doesn’t look too bad (here’s the AsciiDoc manual, which was first converted to DocBook XML and then to PDF), although with just about any lightweight markup language you may run into problems with widows and orphans in generated PDF files. You can, of course, control the look and feel of the output using Cascading Style Sheets, a DocBook customization layer, or your favourite LaTeX document class.

Output, output

As mentioned earlier, you can generate documents formatted with more complex markups from an AsciiDoc source. From there, you can output your documents into other formats. Out of the box, you’ll be stuck with the standard look and feel of the output formats — unless, of course, you have custom DocBook, LaTeX, or HTML stylesheets. You can also use such intermediate converters as docbook2odf to convert output files to another format over which you have a little more control. But that could be a lot more trouble than it’s worth.

You’ll notice, though, that AsciiDoc (like most other lightweight markup languages) doesn’t support direct output to RTF- or HTML-based online help formats. Your best bet is to use DocBook, which supports output to HTML Help (uncompiled), JavaHelp, and WebHelp formats. Or, you can generate chunked HTML and import it into a tool like HTML Help Workshop or RoboHelp. Again, that could be a little more work than it’s worth.

Single sourcing

This is the big one. Unfortunately, most lightweight markup languages fall short in this area. Once again, to do any meaningful single sourcing you’ll have to convert the marked-up files to DocBook and add the markup for profiling (DocBook’s term for conditional text).
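
For readers who haven’t seen profiling, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in practice the filtering is done by the DocBook XSL stylesheets, not by a script like this, and the `profile` function and the sample content are hypothetical. What is taken from DocBook is the `os` attribute, with multiple values separated by semicolons.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment: two paragraphs are tagged for one
# operating system each, the third applies everywhere.
DOCBOOK = """<section>
  <title>Installing</title>
  <para os="linux">Run the installer from a shell.</para>
  <para os="windows">Double-click the installer.</para>
  <para>Restart the application when the installer finishes.</para>
</section>"""


def profile(root: ET.Element, attr: str, value: str) -> ET.Element:
    """Drop every element whose `attr` is set but does not list `value`."""
    for parent in list(root.iter()):
        for child in list(parent):
            tagged = child.get(attr)
            if tagged is not None and value not in tagged.split(";"):
                parent.remove(child)
    return root


linux_doc = profile(ET.fromstring(DOCBOOK), "os", "linux")
print(ET.tostring(linux_doc, encoding="unicode"))
```

Running the same source through `profile` with `"windows"` instead would keep the other tagged paragraph, which is the essence of conditional text: one source, several tailored outputs.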

AsciiDoc enables you to combine individual sections into a single document. You don’t get the topic-based granularity of DITA, but you do get a bit more control over what goes into a document.
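
AsciiDoc does this with its include macro, written as `include::filename[]` on a line of its own. The Python sketch below mimics the effect on in-memory text so that it runs as-is. It is not how AsciiDoc itself is implemented, and the file names, contents, and the `expand` helper are all hypothetical.

```python
import re

# Hypothetical source files, kept in a dict instead of on disk.
FILES = {
    "guide.txt": (
        "User Guide\n==========\n\n"
        "include::install.txt[]\n\n"
        "include::usage.txt[]\n"
    ),
    "install.txt": "Installing\n----------\nRun the installer.\n",
    "usage.txt": "Using\n-----\nStart the application.\n",
}

# Matches a line consisting only of an include macro with empty attributes.
INCLUDE = re.compile(r"^include::([^\[]+)\[\]$")


def expand(name: str, files: dict) -> str:
    """Return the text of `name` with include lines replaced by file contents."""
    lines = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            # Recurse so that included files may include other files.
            lines.append(expand(match.group(1), files).rstrip("\n"))
        else:
            lines.append(line)
    return "\n".join(lines) + "\n"


combined = expand("guide.txt", FILES)
print(combined)
```

Swapping which sections the master file includes is how you get different manuals from the same set of section files.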

Conclusion

If your documentation needs are simple, and you have a low or non-existent budget, then a lightweight markup language might be worth investigating. If you’re doing any sort of single sourcing or component-based writing, then I’d consider something else. The better lightweight markup languages are good for what I call monolithic single sourcing: creating multiple outputs of a large document, where you split the document out into manuals for different levels of user experience and different operating systems, as well as into multiple formats.

That said, I’ll continue using Markdown to write articles and blog posts, especially when I’m offline and want to knock off a few quick posts. But for writing documentation, I’ll stick with the industry-standard tools. There’s really no way a lightweight markup language can compete.

What’s your take on this? Feel free to leave a comment.

This work, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

  • Benjamin Klum

    Hello Scott!

    Interesting article. As I’m a student and have to write seminar papers quite often I’ve also dealt with this topic.

    I’ve gone through several phases within the last years:
    1.) Writing in Word & Co.
    2.) Writing in LaTeX
    3.) Generating LaTeX based on database content with help of a programming language
    4.) Writing in Wiki Syntax (For this purpose I’ve used Stuart’s AsciiDoc and have written a LaTeX backend)
    5.) Writing in Docbook XML dialect
    6.) Writing in a custom XML dialect (Conversion to Docbook XML via custom XSLT script)

    None of the solutions seemed perfect to me. Some were too verbose, some too inflexible.

    Now I think that the best solution is to combine several approaches, in particular 4, 5 and 6.

    Solution 1: Docbook XML & Wiki Syntax
    * Docbook is very flexible, but verbose
    * Wiki Syntax is lightweight, but inflexible
    * Docbook could be used as base to define the document’s big structure.
    * Wiki Syntax could be used inline in Docbook XML tags where appropriate to create non-complex text passages with simple formatting, simple elements like lists etc.
    * A processor would be necessary to convert the wiki markup to Docbook XML, the result could be processed normally by Docbook XSL stylesheets.

    Solution 2: Custom XML & Wiki Syntax
    * Like solution 1, but using custom XML dialect and XSL stylesheet which converts to docbook.

    Just some thoughts.

    Regards
    Benjamin

  • http://www.lunatech-research.com/ Peter Hilton

    Sure, if you have minimal ‘formatted output’ requirements. I use a wiki for project documentation – http://www.lunatech-research.com/archives/2006/12/04/wiki-wordprocessing

  • Tomek Kaczanowski

    Congrats Scott, you did a really nice job on all these formats and tools. Good, informative read – thanks!

    Personally I use AsciiDoc (I’m writing a book in it, and wouldn’t change it for anything else), and at work also reStructuredText; both are really good. The magic begins when you use them with other tools as well. In the case of reStructuredText, I really enjoy using it together with PlantUML: you can have very nice UML diagrams in your documentation with minimal effort. And you can quickly generate nice presentations using the AsciiDoc + Slidy combo (see http://kaczanowscy.pl/tomek/2011-09/nice-presentations-in-no-time-with-asciidoc-and-slidy for more information).

    In short – markup languages are great. It is much easier to update a wiki-like document than to change something in Word (at least for developers). Recently my team managed to gather all documentation in one place (in fact we build it on CI – see http://kaczanowscy.pl/tomek/2011-12/gather-all-development-documentation-in-one-place) so it is easily accessible (and always the latest version).


    Cheers,
    Tomek Kaczanowski