Markdown has become the go-to lightweight markup language for a variety of tasks—you’ll find it powering commenting systems, blogs, wikis, and static websites of all sorts. This widespread adoption has led to a number of advantages over other markup languages—namely, AsciiDoc and reStructuredText—when it comes to writing technical documentation:
Note: When I refer to “AsciiDoc,” I’m talking about the variation of the syntax compatible with the Asciidoctor toolchain (not the older Python implementation).
Familiarity. Almost everyone in the technical writing community (developers, writers, editors, etc.) has some experience with Markdown. And while AsciiDoc and reStructuredText (to a lesser extent) are becoming more popular, they’re still foreign to many people.
Tooling. Markdown has a larger and more mature selection of tooling available, including linters (which AsciiDoc currently lacks altogether), standalone editors, and static site generators.
Performance. Markdown has a native implementation in many languages, while AsciiDoc (Ruby, JavaScript, and Java) and reStructuredText (Python) have fairly limited support. This can lead to performance issues, as is the case with AsciiDoc and Hugo.
One of the reasons that Markdown is so popular is its simple, easy-to-read syntax (that can be learned in as little as 10 minutes). Unfortunately, the simplicity of Markdown also means that it’s missing some important features that are available (to some extent) in both AsciiDoc and reStructuredText:
Support for content reuse. Perhaps the most important of Markdown’s missing features is the ability to include content from other files.
Management of tabular data. While this isn’t technically a “missing” feature, Markdown is notoriously bad at handling tables: they’re a pain to write, hard to maintain, and you’re limited to the most basic of layouts.
Extensibility. The inability to extend Markdown’s syntax often forces us to write HTML for fairly simple tasks such as applying classes to elements or embedding multimedia.
The most common solution to the above issues is to either use a more feature-rich “flavor” of Markdown (such as Python-Markdown) or a template language (such as Liquid). There are, however, a few drawbacks to doing so:
Depending on features that are tied to a particular flavor or static site generator (as is the case with many template languages) means that you’re effectively eliminating Markdown’s advantage in tooling: What good is having a wide selection of static site generators if switching between them requires significant reformatting? What good are linters if you have to use syntax that they don’t understand?
By deviating from Markdown’s “standard” syntax (the CommonMark Spec), you’re sacrificing Markdown’s simplicity and, by extension, its advantage in familiarity.
Another solution is to give up Markdown altogether: the sentiment that its flaws make it an inadequate choice for technical documentation is becoming increasingly popular. The problem here, aside from ignoring the strengths of Markdown, is that many of us aren’t in a position to make such a change. A lack of time, approval, or desire (or any combination of the three) can make such a suggestion a non-starter.
But what if there was an option that addressed Markdown’s weaknesses without sacrificing its strengths?
Introducing “Markdata”
Markdata is an open-source, MIT-licensed Python library and command-line tool for managing “data” (essentially, any non-prose supplement that might appear in your markup — e.g., code, diagrams, tables, etc.) in Markdown files. There are a few driving principles behind this library:
Accept Markdown for what it is (and isn’t). Markdown, unlike many markup languages, is designed to be converted into a single format: (X)HTML. So, in order to get the most out of Markdown, you need to embrace the fact that you’ll need to know and use HTML—it’s currently the only portable way to extend Markdown’s syntax.
Avoid doing too much in your markup. Conditional logic, looping constructs, HTML, source code, and tables should not be directly written in your markup.
Single-source whenever possible. This is related to the previous point, but it’s worth re-stating: keeping your documentation DRY should be one of your highest priorities. All non-prose supplements (such as HTML snippets, code examples, and tabular data) should be written once and included elsewhere.
By adhering to the above principles, Markdata is able to extend Markdown in meaningful ways without deviating from CommonMark.
Getting Started
Markdata is available on PyPI (Python >= 3.6.0):
|
|
Markdata’s functionality is driven by directives, which are Markdown snippets that call Python functions. Instead of introducing new syntax, directives overload the existing containers for raw strings: code spans and code blocks.
The basic idea is that you specify the name of a Python function and its expected arguments within your markup:
|
|
And then you write the implementation in a separate Python source file:
|
|
You’ll typically store these directives in a directory close to your Markdown content and then tell markdata where to look:
|
|
In other words, Markdata acts as more of a “preprocessor” than an actual implementation: you call the markdata executable prior to (not instead of) another Markdown library.
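To make the preprocessor idea a bit more concrete, here's a toy sketch of the general technique: scan the Markdown for code spans that name a function followed by a dict of arguments (the same shape as the execute{...} example later in this post), call the function, and splice its return value back into the text. Everything in it (DIRECTIVES, directive, shout, preprocess) is a hypothetical name I made up for illustration; it is not Markdata's actual API or implementation.

```python
import ast
import re

# Hypothetical registry mapping directive names to plain Python functions.
DIRECTIVES = {}


def directive(func):
    """Register a function as a directive under its own name."""
    DIRECTIVES[func.__name__] = func
    return func


@directive
def shout(text):
    # Illustrative directive: its return value replaces the code span.
    return text.upper()


# Matches an inline code span such as `shout{'text': 'hello'}`.
SPAN = re.compile(r"`(\w+)(\{[^`]*\})`")


def preprocess(markdown):
    """Replace recognized directive spans with the output of their functions."""

    def expand(match):
        name, raw_args = match.group(1), match.group(2)
        if name not in DIRECTIVES:
            return match.group(0)  # leave ordinary code spans untouched
        kwargs = ast.literal_eval(raw_args)  # parse the dict literal safely
        return DIRECTIVES[name](**kwargs)

    return SPAN.sub(expand, markdown)
```

Because the output is still ordinary Markdown (or HTML embedded in it), any CommonMark-compatible library can take over from there. A real tool would also need to handle code blocks and malformed arguments, which this sketch skips.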
Now that you know the basics, let’s look at some real use cases.
Content reuse
The document directive (one of Markdata’s two built-in directives) brings content reuse to Markdown. To use this directive, you specify a path (relative to the directive-containing file) and, optionally, a span of lines:
|
|
In the example above, we inserted lines 10 through 13 of my_file.py into a Python code block and included the entire contents of some_file.md at the end. To perform the insertions, simply call the markdata executable on the file:
|
|
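If you're curious what "a path plus an optional span of lines" boils down to, here's a rough sketch of the underlying operation in plain Python. The function name include_lines and its parameters are my own assumptions for illustration; the real document directive may behave differently (for example, in how it resolves relative paths).

```python
from pathlib import Path


def include_lines(path, start=None, end=None):
    """Return the text of `path`, optionally limited to lines start..end.

    Line numbers are 1-indexed and inclusive, so (10, 13) yields four lines.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    first = 0 if start is None else start - 1
    last = len(lines) if end is None else end
    return "\n".join(lines[first:last])
```

Calling include_lines("my_file.py", 10, 13) would return lines 10 through 13, while omitting the span returns the whole file.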
Tables
table is the second built-in directive. It allows you to write, edit, and maintain your tables in YAML, JSON, or CSV files. For example, consider the following YAML sequence:
|
|
To turn this into a table, we simply use the directive:
|
|
After calling the markdata executable, we get:
|
|
If the basic layout supported by table doesn’t fit your personal needs, you’re free to create custom directives that can produce any layout possible in HTML.
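As a starting point for a custom layout, here's a minimal sketch of a function that turns a sequence of records into an HTML table. It assumes the data has already been loaded into a list of dicts (which is what json.load or csv.DictReader would give you); rows_to_table is a hypothetical name, and this is not how the built-in table directive is implemented.

```python
import html


def rows_to_table(rows):
    """Render a list of dicts (one per row) as a basic HTML table."""
    if not rows:
        return "<table></table>"
    headers = list(rows[0])  # column order follows the first row's keys
    head = "".join(f"<th>{html.escape(str(h))}</th>" for h in headers)
    body = []
    for row in rows:
        # Missing keys become empty cells; values are escaped for safety.
        cells = "".join(
            f"<td>{html.escape(str(row.get(h, '')))}</td>" for h in headers
        )
        body.append(f"<tr>{cells}</tr>")
    return (
        f"<table><thead><tr>{head}</tr></thead>"
        f"<tbody>{''.join(body)}</tbody></table>"
    )
```

From here, things like column spans, captions, or per-cell classes are just more string building.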
Multimedia
What if we want to include a YouTube video or Instagram post in our Markdown? With Markdata, we simply write a directive!
To embed a YouTube video, for example, we could do something like this:
|
|
And then, we can simply use the directive in our Markdown:
|
|
Of course, you could also add classes and attributes to allow for more precise styling and positioning.
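For instance, a function along these lines would produce a YouTube embed with an optional class for styling. This is only a sketch: the name youtube, its parameters, and the default dimensions are my own assumptions, and how you'd register it with Markdata isn't shown.

```python
import html


def youtube(video_id, width=560, height=315, css_class=None):
    """Return an <iframe> embed for a YouTube video ID."""
    attrs = [
        f'src="https://www.youtube.com/embed/{html.escape(video_id, quote=True)}"',
        f'width="{int(width)}"',
        f'height="{int(height)}"',
        "allowfullscreen",
    ]
    if css_class:
        # Put the class first so it's easy to spot in the generated HTML.
        attrs.insert(0, f'class="{html.escape(css_class, quote=True)}"')
    return f"<iframe {' '.join(attrs)}></iframe>"
```

The Markdown source only ever sees the video ID; the HTML lives in one place and can be changed for every embed at once.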
Semantic meaning
Adding meaning to elements is another common use case for Markdata.
Let’s say that we want to be able to use Bootstrap-style alerts and add classes to our paragraphs. Once again, we write custom directives:
|
|
|
|
We can now use both of these as block-level directives:
|
|
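To give a feel for how little code this takes, here's a hedged sketch of two such functions. The names alert and paragraph, and their parameters, are hypothetical; the class names follow Bootstrap's alert convention (alert plus alert-LEVEL).

```python
import html

# Contextual levels defined by Bootstrap's alert component.
ALERT_LEVELS = {
    "primary", "secondary", "success", "danger",
    "warning", "info", "light", "dark",
}


def alert(text, level="info"):
    """Wrap text in a Bootstrap-style alert <div>."""
    if level not in ALERT_LEVELS:
        raise ValueError(f"unknown alert level: {level!r}")
    return f'<div class="alert alert-{level}" role="alert">{html.escape(text)}</div>'


def paragraph(text, classes=()):
    """Wrap text in a <p> element carrying the given classes."""
    class_attr = html.escape(" ".join(classes), quote=True)
    return f'<p class="{class_attr}">{html.escape(text)}</p>'
```

Validating the level up front means a typo fails at build time instead of silently producing an unstyled box.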
Leveraging the power of Python
All of the examples so far have been pretty straightforward: we use Python as a middle ground between Markdown and HTML, allowing us to add features without straying from CommonMark syntax.
But, as you can probably imagine, the ability to write arbitrary Python means that we can do much more than text-based substitutions. Here are some ideas:
Create tables by making calls to APIs or databases at build time.
Use plots that update at build time using one of Python’s plotting libraries (e.g., Seaborn).
Include output from command-line tools by directly executing them.
Implement directives in another programming language that you call via Python (e.g., execute{'runtime': 'node', 'path': 'script.js'}).
… and more!
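As one example from the list above, including the output of a command-line tool could look roughly like this. It's a sketch built on the standard library's subprocess module (Python 3.7+ for capture_output and text); command_output is a hypothetical name, not something Markdata ships.

```python
import html
import subprocess


def command_output(args, timeout=10):
    """Run a command and return its stdout wrapped in <pre><code>."""
    result = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise if the command exits non-zero
        timeout=timeout,      # don't let a hung tool stall the build
    )
    return f"<pre><code>{html.escape(result.stdout.rstrip())}</code></pre>"
```

Passing the command as a list (rather than a shell string) avoids shell injection, which matters if any argument ever comes from your content files.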
Conclusion
Markdata takes a new approach to an old problem: it allows us to extend Markdown without breaking away from CommonMark. It also allows us to move most template-related logic out of our markup source, making our documentation easier to maintain and less dependent on specific template languages.
The downsides are that it requires familiarity with Python and an additional build step.
Markdata is still in early development and I’d love to hear any thoughts or suggestions. Feel free to open an issue at the GitHub repository!