markdown2: A fast and complete implementation of Markdown in Python

Overview

Markdown is a lightweight text markup format, together with a processor that converts that format to HTML. Its originator describes it as follows:

Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).

-- http://daringfireball.net/projects/markdown/

This (markdown2) is a fast and complete Python implementation of Markdown. It was written to closely match the behaviour of the original Perl-implemented Markdown.pl. Markdown2 also comes with a number of extensions (called "extras") for things such as syntax coloring, tables, and header-ids. See the "Extra Syntax" section below. markdown2 supports all Python versions 2.6+ or 3.3+ (and pypy and jython, though I don't frequently test those).

There is another Python markdown.py. However, at least at the time this project was started, markdown2.py was faster (see the Performance Notes) and, to my knowledge, more correct (see Testing Notes). That was a while ago though, so you shouldn't discount Python-markdown from your consideration.

Follow @trentmick for updates to python-markdown2.

Install

To install it in your Python installation run one of the following:

pip install markdown2
pypm install markdown2      # if you use ActivePython (activestate.com/activepython)
easy_install markdown2      # if this is the best you have
python setup.py install

However, everything you need to run this is in "lib/markdown2.py". If it is easier for you, you can just copy that file to somewhere on your PythonPath (to use as a module) or executable path (to use as a script).

Quick Usage

As a module:

>>> import markdown2
>>> markdown2.markdown("*boo!*")  # or use `html = markdown_path(PATH)`
u'<p><em>boo!</em></p>\n'

>>> from markdown2 import Markdown
>>> markdowner = Markdown()
>>> markdowner.convert("*boo!*")
u'<p><em>boo!</em></p>\n'
>>> markdowner.convert("**boom!**")
u'<p><strong>boom!</strong></p>\n'

As a script (CLI):

$ python markdown2.py foo.md > foo.html

or

$ python -m markdown2 foo.md > foo.html

I think pip-based installation will enable this as well:

$ markdown2 foo.md > foo.html

See the project wiki, lib/markdown2.py docstrings and/or python markdown2.py --help for more details.

Extra Syntax (aka extensions)

Many Markdown processors include support for additional optional syntax (often called "extensions") and markdown2 is no exception. With markdown2 these are called "extras". Using the "footnotes" extra as an example, here is how you use an extra ... as a script:

$ python markdown2.py --extras footnotes foo.md > foo.html

as a module:

>>> import markdown2
>>> markdown2.markdown("*boo!*", extras=["footnotes"])
u'<p><em>boo!</em></p>\n'

There are a number of currently implemented extras for tables, footnotes, syntax coloring of <pre>-blocks, auto-linking patterns, table of contents, Smarty Pants (for fancy quotes, dashes, etc.) and more. See the Extras wiki page for full details.

Project

The python-markdown2 project lives at https://github.com/trentm/python-markdown2/. (Note: On Mar 6, 2011 this project was moved from Google Code to here on GitHub.) See also markdown2 on the Python Package Index (PyPI).

The change log: https://github.com/trentm/python-markdown2/blob/master/CHANGES.md

To report a bug: https://github.com/trentm/python-markdown2/issues

Contributing

We welcome pull requests from the community. Please take a look at the TODO for opportunities to help this project. For those wishing to submit a pull request to python-markdown2 please ensure it fulfills the following requirements:

  • It must pass PEP8.
  • It must include relevant test coverage.
  • Bug fixes must include a regression test that exercises the bug.
  • The entire test suite must pass.
  • The README and/or docs are updated accordingly.

Test Suite

This markdown implementation passes a fairly extensive test suite. To run it:

make test

The crux of the test suite is a number of "cases" directories -- each with a set of matching .text (input) and .html (expected output) files. These are:

tm-cases/                   Tests authored for python-markdown2 (tm=="Trent Mick")
markdowntest-cases/         Tests from the 3rd-party MarkdownTest package
php-markdown-cases/         Tests from the 3rd-party MDTest package
php-markdown-extra-cases/   Tests also from MDTest package

See the Testing Notes wiki page for full details.
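To make the "cases" layout concrete, here is a minimal sketch of how matching .text and .html files in one such directory could be paired up. The helper name `iter_cases` and the sample file names are hypothetical; the project's real test driver may work differently.

```python
import tempfile
from pathlib import Path


def iter_cases(cases_dir):
    """Yield (input_path, expected_path) for each .text file with a matching .html file."""
    for text_path in sorted(Path(cases_dir).glob("*.text")):
        html_path = text_path.with_suffix(".html")
        if html_path.exists():
            yield text_path, html_path


# Demo on a throwaway directory: one complete case and one input without expected output.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "emphasis.text").write_text("*boo!*\n", encoding="utf-8")
    Path(tmp, "emphasis.html").write_text("<p><em>boo!</em></p>\n", encoding="utf-8")
    Path(tmp, "orphan.text").write_text("no expected output\n", encoding="utf-8")
    pairs = [(t.name, h.name) for t, h in iter_cases(tmp)]

print(pairs)
```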

Comments
  • Fix for "Footnote titles not translatable #244"

    Resolves #244

    There are a few aesthetic options about how one might try to allow different language notes in footnotes in markdown. I find it a bit hard to allow this in a nice way from direct command line usage, so I've chosen an implementation through in-interpreter use.

    The default footnote title is "Jump back to footnote %d in the text." To change, I propose the following.

    markdowner = markdown2.Markdown(extras=["footnotes"])
    markdowner.footnote_title = "Retournez vers footnote %d. "
    markdowner.convert(text)
    

    The extra information is passed as an attribute of the markdown2.Markdown class used for conversion. If there is a format problem, then I revert to the default.
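The "revert to the default on a format problem" behaviour described above can be sketched as follows. This is an illustration only, not the code from the pull request; the function name `footnote_title` and the constant name are hypothetical.

```python
DEFAULT_FOOTNOTE_TITLE = "Jump back to footnote %d in the text."


def footnote_title(template, number):
    """Format template with the footnote number; fall back to the default on a bad template."""
    try:
        return template % number
    except (TypeError, ValueError):
        # Raised for templates with no placeholder, too many placeholders,
        # an unsupported format character, or a non-string template.
        return DEFAULT_FOOTNOTE_TITLE % number


print(footnote_title("Retournez vers footnote %d. ", 3))
print(footnote_title("no placeholder here", 3))
```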

    Note: I'm having trouble running tests on my own [I get 22 failures on the current master branch], so I expect to need to clean up a few things first.

    opened by davidlowryduda 14
  • TOC output when invoking markdown2 as a command through CLI

    Hey I have installed markdown2 using PIP in one of my Docker images and I can invoke it like this: markdown2 input.md > output.html, but I am not able to use markdown2 as a command on Windows nor on WSL when installed through PIP for some reason.

    Is there any info on using the TOC extra through CLI? I ran the same script with the TOC extra in my CI pipeline with the Docker image where I am able to invoke markdown2 as a command and pipe the output to a file and it didn't get embedded into the file, so I suppose I can only wrap the invocation in a Python script and write the files myself?

    opened by TomasHubelbauer 13
  • markdown2 is slow

    Parsing the Markdown Syntax document 1000 times...
    Mistune: 12.9425s
    Misaka: 0.537176s
    Markdown: 47.7091s
    Markdown2: 80.5163s
    cMarkdown: 0.680664s
    

    http://lepture.com/en/2014/markdown-parsers-in-python
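Figures like the ones above come from timing repeated conversions of one document. A minimal harness for that kind of measurement is sketched below; it is not the script from the linked post. The stand-in converter exists only so the sketch runs without any Markdown library installed.

```python
import html
import time


def time_converter(convert, text, repeats=1000):
    """Return the seconds taken to call convert(text) `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        convert(text)
    return time.perf_counter() - start


def stand_in(text):
    """Placeholder converter; swap in e.g. markdown2.markdown to time a real parser."""
    return "<p>" + html.escape(text) + "</p>"


elapsed = time_converter(stand_in, "*boo!* " * 100, repeats=1000)
print("stand-in: %.4fs" % elapsed)
```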

    opened by techtonik 13
  • Another Filter bypass leading to XSS

    On the latest release (2.3.8) a payload like this one can lead to xss and bypass safe_mode when set to true.

    <lol@/ //id="pwn"//onclick="alert(1)"//**abc**

    The problem: I think it's due to regexes that fail to detect tags containing non-alphanumeric characters.

    opened by TheGrandPew 12
  • Replace md5() with sha256() for hashing

    On FIPS[0]-compliant installations the MD5 modules are unavailable, so md5() raises an exception. Using sha256() from the same hashlib module passes the same make test and is FIPS compliant.

    [0] http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140val-all.htm
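The swap amounts to changing which hashlib constructor produces the placeholder key. A minimal sketch follows; the function name `hash_text` and the "sha256-" prefix are hypothetical and are not taken from markdown2's internals.

```python
import hashlib


def hash_text(text):
    """Return a placeholder key for text, derived from its SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return "sha256-" + digest


key = hash_text("<div>some protected HTML</div>")
print(key)
```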

    opened by shaunbrady 11
  • Python 3 support?

    I've been a longtime user of python-markdown2 and like the way it just works. I'd like to start using it in a Python 3-based project, but it seems as if there is no goal of porting to Python 3 (I last checked it out a year or so ago -- at that time I did my own port).

    Would you like me to contribute the port I have made, or is there no real motivation to support Python 3 at this time?

    opened by nfd 11
  • Potential Extension

    I've been using markdown2 a lot in a recent project to convert a \LaTeX document generator into an online research environment. Combined with MathJax, Markdown2 with the footnotes extension is pretty much perfect. However, the one thing that Markdown can't do is number figures and tables. This is pretty much essential in research papers and is also a useful feature in other styles of documentation.

    Whilst I understand that there are way too many "standards" for Markdown, I wanted to add an extension to handle figure, table and equation numbering to Markdown. What I've done (in a relatively few lines of code) is add a preprocessor which generates generic "counters" which can be used to number equations. The syntax is as follows.

    When you first want to label an item (figure, table, image whatever), you add a markdown item

         [#counter_id](:item_tag)
    

    and when you want to refer to it in the text you use [:item_tag]. Here both counter_id and item_tag are "words" in the regex sense.

    During conversion, the first pass identifies all the labels, creating counters as required and incrementing them if they already exist. During the second pass, the counter integers are substituted into the text. In principle, each counter can have its own CSS style associated with it. At present it doesn't hyperlink the references to the original item, but that would be easy to add.

    The code is about 40 lines of Python.

    So the issue is: Is this more generally useful? There's almost no support for "exhibit numbering" in any Markdown converters -- which is why I had to write my own. I could certainly add this as an "extra" in markdown2 if there was general support for it but I don't want to add useless cruft to a project. Comments?
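The two-pass scheme described above can be sketched in a few lines. This is an independent illustration, not the author's code. The function name and the choice to leave unknown tags untouched are assumptions.

```python
import re
from collections import defaultdict

LABEL_RE = re.compile(r"\[#(\w+)\]\(:(\w+)\)")  # [#counter_id](:item_tag)
REF_RE = re.compile(r"\[:(\w+)\]")              # [:item_tag]


def number_items(text):
    counters = defaultdict(int)  # counter_id -> last number handed out
    numbers = {}                 # item_tag -> number assigned to that item

    # First pass: walk the labels in document order and assign numbers.
    for counter_id, item_tag in LABEL_RE.findall(text):
        if item_tag not in numbers:
            counters[counter_id] += 1
            numbers[item_tag] = counters[counter_id]

    # Second pass: replace labels, then references, with the assigned numbers.
    text = LABEL_RE.sub(lambda m: str(numbers[m.group(2)]), text)
    # References to unknown tags are left as they are.
    return REF_RE.sub(lambda m: str(numbers.get(m.group(1), m.group(0))), text)


source = ("Figure [#fig](:cat) is a cat and Figure [#fig](:dog) is a dog. "
          "Table [#tab](:sizes) lists sizes. Compare [:dog] with [:cat]; [:missing] stays.")
result = number_items(source)
print(result)
```

Because numbers are assigned in a separate first pass, a reference may appear before the item it points to.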

    opened by ewankirk 10
  • Fix code block indentation in lists

    This is a fix for issue #276. There's a failure in converting markdown lists that have fenced code blocks in them, because the indentation of the fenced code block is not accounted for unless syntax highlighting is enabled.

    I also pulled out the code for substituting code blocks with a lexer into its own method so it is a little easier to read.

    opened by momja 9
  • Added header support for wiki tables

    Following the syntax used in Wikidot:

    ||~ Head1 ||~ Head2 ||
    ||  cell1 ||  cell2 ||
    

    Also added indenting to match the format found when using the tables extra.
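To illustrate the syntax, the sketch below turns one wiki-table row into an HTML row, treating cells that start with `~` as header cells. It is a standalone illustration of the idea, not the pull request's implementation, and the function name is hypothetical.

```python
import html


def wiki_row_to_html(line):
    """Convert one `|| a || b ||` row to a <tr>; cells starting with `~` become <th>."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("||")]
    parts = []
    for cell in cells:
        if cell.startswith("~"):
            parts.append("<th>%s</th>" % html.escape(cell[1:].strip()))
        else:
            parts.append("<td>%s</td>" % html.escape(cell))
    return "<tr>" + "".join(parts) + "</tr>"


header_row = wiki_row_to_html("||~ Head1 ||~ Head2 ||")
body_row = wiki_row_to_html("||  cell1 ||  cell2 ||")
print(header_row)
print(body_row)
```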

    opened by ryanvilbrandt 9
  • meta-data feature?

    Are there any plans to implement meta data feature (maybe like the one linked?). If not, would you accept a patch for that? And if yes what should the api look like?

    opened by slomo 9
  • Incorrect code block conversion

    Issue

    The code block

    ``` c const var = foo() ```

    is being converted to:

    <pre><code>&lt;div class="codehilite"&gt;
    &lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;
    &lt;/div&gt;
    </code></pre>
    

    and should be converted to:

    <div class="codehilite">
        <pre><span></span><code><span class="k">const</span><span class="w"> </span><span class="n">var</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">foo</span><span class="p">()</span><span class="w"></span>
        </code></pre>
        </div>
    

    Observation

    During the conversion, the input text passes through the function _do_code_blocks(self, text), which hashes the <div class="codehilite"></div> and its content and wraps the result in <pre><code></code></pre>.

    Output of _do_code_blocks(self, text):

    '<pre><code>md5-353f226db77851404d4ad30e2cb646b8\n</code></pre>\n'
    

    The output of _do_code_blocks(self, text) is then passed to the function _unescape_special_chars(self, text), where the hash is converted to the following output:

    <pre><code>&lt;div class="codehilite"&gt;
    &lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
    &lt;/code&gt;&lt;/pre&gt;
    &lt;/div&gt;
    </code></pre>
    

    Steps to reproduce:

    1. create markdown file and paste the text: ``` c const var = foo() ```

    2. python /lib/markdown2.py --extras fenced-code-blocks foo.md > foo.html

    Bug 
    opened by XDanielPaul 8
  • toc: Headings are not found after a html page-break

    Example:

    # Introduction
    
    ...
    <!-- every heading after a page-break is missing in toc -->
    <div style="page-break-before: always"></div>
    
    ## Main Features
    ...
    
    Bug 
    opened by berndbenner 0
  • Extra for generating link preview

    Instead of using a node.js based solution (e.g. link-preview-generator), I am wondering how I can generate link previews during the processing of markdown. There appear to be modules (e.g. linkpreview) that do the heavy lifting, so maybe it is not too difficult to add an extra for this.

    Feature 
    opened by BoPeng 5
  • Clearer warning when not all tests are ran

    This PR fixes #458.

    It works by capturing and reading each line of each test runner's stderr in real time. If the line contains a warning it is saved to be re-printed later. Once all of the tests have finished, the warnings are printed to stdout along with the version of Python that raised the warning.
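The approach described above (read a runner's stderr line by line, save warnings, print them again at the end) can be sketched like this. It is not the pull request's code. The function name, the "warning" substring check, and the stand-in child process are assumptions made for illustration.

```python
import subprocess
import sys


def run_and_collect_warnings(label, cmd):
    """Run cmd, echo its stderr as it arrives, and return (exit_code, saved_warning_lines)."""
    saved = []
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    for line in proc.stderr:  # yields lines as the child writes them
        sys.stderr.write(line)
        if "warning" in line.lower():
            saved.append("%s: %s" % (label, line.rstrip()))
    return proc.wait(), saved


# Stand-in for a real test runner: a child Python that writes two lines to stderr.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('WARNING: pygments not installed\\nall tests ok\\n')"]
code, saved = run_and_collect_warnings("python%d.%d" % sys.version_info[:2], child)

# Re-print the saved warnings once the run has finished.
for line in saved:
    print(line)
```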

    opened by Crozzers 2
  • Clearer warning when not all tests are ran

    See

    https://github.com/trentm/python-markdown2/pull/455#issuecomment-1182130274

    https://github.com/trentm/python-markdown2/pull/455#issuecomment-1182538886

    Maybe we print warnings at the end of the test run when pygments isn't installed locally. Or do something else to make it clear that not all tests have run.

    Priority 
    opened by nicholasserra 0
  • Refactoring: split into files

    Hi there.

    First, thank you for your great work :heart:

    Following this discussion #382 , I wanted to try to make the extra part more flexible. I realized it's quite hard to follow the code because everything is packed in a huge file.

    This PR is a first step to refactor the code and to then make the extras more flexible/pluggable.

    Would you agree to split this code? I don't mind changing the file names if you don't like this organization. But I think it would be nice to avoid having just one big pile of code.

    opened by Einenlum 8
  • pytest integration

    This is a "preliminary" pytest port of the testlib.py code, leveraging pytest and its ecosystem.

    It supports running all (unchanged) tests (tm-cases, markdowntest-cases, php-markdown-cases and php-markdown-extra-cases).

    pytest itself provides extensive support for code quality (coverage, unitest xml etc.).

    This change also integrates the workflow actions.

    opened by cav71 3
Owner
Trent Mick
A markdown lexer and parser which gives the programmer atomic control over markdown parsing to html.

stonepresto 4 Aug 13, 2022
Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files

Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files. Mdformat is a Unix-style command-line tool as well as a Python library.

Executable Books 180 Jan 6, 2023
A fast yet powerful Python Markdown parser with renderers and plugins.

Mistune v2 A fast yet powerful Python Markdown parser with renderers and plugins. NOTE: This is the re-designed v2 of mistune. Check v1 branch for ear

Hsiaoming Yang 2.2k Jan 4, 2023
A fast, extensible and spec-compliant Markdown parser in pure Python.

mistletoe mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest Comm

Mi Yu 546 Jan 1, 2023
A lightweight and fast-to-use Markdown document generator based on Python

快乐的老鼠宝宝 1 Jan 10, 2022
A Python implementation of John Gruber’s Markdown with Extension support.

Python-Markdown This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though

Python-Markdown 3.1k Dec 30, 2022
Static site generator that supports Markdown and reST syntax. Powered by Python.

Pelican Pelican is a static site generator, written in Python. Write content in reStructuredText or Markdown using your editor of choice Includes a si

Pelican dev team 11.3k Jan 5, 2023
Extensions for Python Markdown

PyMdown Extensions Extensions for Python Markdown. Documentation Extension documentation is found here: https://facelessuser.github.io/pymdown-extensi

Isaac Muse 685 Jan 1, 2023
Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed. Now in Python!

markdown-it-py Markdown parser done right. Follows the CommonMark spec for baseline parsing Configurable syntax: you can add new rules and even replac

Executable Books 398 Dec 24, 2022
Lightweight Markdown dialect for Python desktop apps

Litemark is a lightweight Markdown dialect originally created to be the markup language for the Codegame Platform project. When you run litemark from the command line interface without any arguments, the Litemark Viewer opens and displays the rendered demo.

null 10 Apr 23, 2022
A markdown template manager for writing API docs in python.

DocsGen-py A markdown template manager for writing API docs in python. Contents Usage API Reference Usage You can install the latest commit of this re

Ethan Evans 1 May 10, 2022
Livemark is a static page generator that extends Markdown with interactive charts, tables, and more.

Livemark This software is in the early stages and is not well-tested Livemark is a static site generator that extends Markdown with interactive chart

Frictionless Data 86 Dec 25, 2022
Read a list in markdown and do something with it!

Markdown List Reader A simple tool for reading lists in markdown. Usage Begin by running the mdr.py file and input either a markdown string with the -

Esteban Garcia 3 Sep 13, 2021
Yuque2md - Offline download the markdown file and image from yuque

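yuque2md Exports all Markdown documents from a Yuque knowledge base, following the knowledge base's directory structure, and saves the images locally for offline use Usage Install Python 3.x Clone the project Download dep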

JiaJianHuang 4 Oct 30, 2022
Convert HTML to Markdown-formatted text.

html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to

Alireza Savand 1.3k Dec 31, 2022
Comprehensive Markdown plugin built for Django

Django MarkdownX Django MarkdownX is a comprehensive Markdown plugin built for Django, the renowned high-level Python web framework, with flexibility,

neutronX 740 Jan 8, 2023
Awesome Django Markdown Editor, supported for Bootstrap & Semantic-UI

martor Martor is a Markdown Editor plugin for Django, supported for Bootstrap & Semantic-UI. Features Live Preview Integrated with Ace Editor Supporte

null 659 Jan 4, 2023
A super simple script which uses the GitHub API to convert your markdown files to GitHub styled HTML site.

Çalgan Aygün 213 Dec 22, 2022