A fast yet powerful Python Markdown parser with renderers and plugins.

Overview

Mistune v2

NOTE: This is the redesigned v2 of mistune. Check the v1 branch for earlier code.

Using the old Mistune? Check out the docs: https://mistune.readthedocs.io/en/v0.8.4/

Sponsors

Mistune is sponsored by Typlog, a blogging and podcast hosting platform, simple yet powerful. Write in Markdown.

Support me via GitHub Sponsors.

Install

To install v2 of mistune:

$ pip install mistune==2.0.0a6

Overview

Convert Markdown to HTML with ease:

import mistune

mistune.html(your_markdown_text)

Security Reporting

If you find a security bug, please do not open a public issue or send a public patch. You can email me at [email protected]. An attachment with a patch is welcome. My PGP key fingerprint is:

72F8 E895 A70C EBDF 4F2A DFE0 7E55 E3E0 118B 2B4C

Or, you can use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

License

Mistune is licensed under BSD. Please see LICENSE for licensing details.

Comments
  • Line under quote is not considered part of quote

    Apologies if this is just an issue on my side, but I found that the following two lines are not joined into the same quote when passed to my custom renderer:

    > first quote line
    second line underneath
    

    I assumed they would both be passed together to the block_quote function.

    bug 
    opened by makeworld-the-better-one 21
  • default_features -> default_rules

    Hi there !

    Are you on a strict 'no backward compat' policy (as the version number is < 1.x)?

    I guess we might try to reintroduce default_features as a class property that prints a warning and maps to default_rules?

    Just saying that because all our CI tests have been failing since the 0.5 release, but it's not a big deal; we can update and/or pin the mistune version.

    Thanks !

    opened by Carreau 18
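The backward-compatibility idea in the issue above (keep default_features as an alias that warns and maps to default_rules) could be sketched roughly as follows. This is only an illustration: InlineLexer here is a hypothetical stand-in class with made-up rule names, not mistune's real implementation, and it uses warnings.warn rather than print.

```python
import warnings


class _DeprecatedAlias:
    """Descriptor that forwards reads of an old attribute name to a new one."""

    def __init__(self, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name

    def __get__(self, instance, owner):
        warnings.warn(
            f"{self.old_name} is deprecated, use {self.new_name} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Works for both class-level and instance-level access.
        return getattr(owner if instance is None else instance, self.new_name)


class InlineLexer:  # hypothetical stand-in, not mistune's real class
    default_rules = ["escape", "link", "emphasis", "text"]  # made-up rule names
    default_features = _DeprecatedAlias("default_features", "default_rules")
```

Reading InlineLexer.default_features then returns the same list as default_rules and emits a DeprecationWarning, so old callers keep working while being nudged to update.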
  • Nested HTML inside block_html is escaped when escape=False, parse_block_html=True

    Normally with escape=False, a nested HTML block is correctly left unescaped:

    >>> print markdown('<div id="special-part"><div class="subsection">text</div></div>', escape=False)
    <div id="special-part"><div class="subsection">text</div></div>
    

    But when I add parse_block_html=True, only the outermost element is left unescaped and the rest is escaped:

    >>> print markdown('<div id="special-part"><div class="subsection">text</div></div>', escape=False, parse_block_html=True)
    <div id="special-part">&lt;div class="subsection"&gt;text&lt;/div&gt;</div>
    
    opened by tdivis 17
  • Implement text renderer.

    For rendering of markdown documents in text format.

    The concrete use case is the ability to render templated markdown documents directly within a terminal environment.

    opened by csadorf 12
  • Various Extension System Problems

    I just looked at how the grammar is extended, and in its current form it's not really usable. There are a few problems with the way the inline lexer alone is supposed to be modified:

    • Right now the example modifies default_rules, which is a class attribute and as such is shared across all other lexers as well.
    • Because the text rule is greedy, it needs to be modified whenever another rule is inserted, but it is impossible for a plugin to do so because it cannot know which sequences indicate the start of a rule. The current example only works by chance.

    I think the current design does not lend itself well to extension, but I'm not sure how to fix it without breaking what's already there.

    opened by mitsuhiko 12
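The first point in the issue above (a class attribute being shared across all lexers) can be illustrated with a small, self-contained sketch. SharedLexer and CopyingLexer are hypothetical classes with made-up rule names, not mistune's code; they only show why mutating a class-level list leaks into every instance, and how copying it per instance avoids that.

```python
class SharedLexer:  # hypothetical: every instance reuses the class-level list
    default_rules = ["escape", "link", "text"]

    def __init__(self):
        self.rules = self.default_rules  # same list object as the class attribute


class CopyingLexer:  # hypothetical: every instance gets its own copy
    default_rules = ["escape", "link", "text"]

    def __init__(self):
        self.rules = list(self.default_rules)


# Mutating one SharedLexer instance changes the list every other instance sees.
first = SharedLexer()
first.rules.insert(0, "wiki_link")
shared_leak = "wiki_link" in SharedLexer().rules

# With a per-instance copy, the change stays local to that instance.
custom = CopyingLexer()
custom.rules.insert(0, "wiki_link")
copy_leak = "wiki_link" in CopyingLexer().rules

print(shared_leak, copy_leak)
```

Copying the list in the constructor is one possible mitigation; it does not address the second point about the greedy text rule.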
  • 0.8.4 vs 2.0.0a4 performance

    Congrats @lepture for this project! (I'm doing a benchmark of many Markdown to HTML libraries, and mistune seems to be the best!)

    I used timeit with the same Markdown document, and:

    • version 0.8.4: mistune.markdown(s) took 4ms on average
    • version 2.0.0a4: mistune.markdown(s) took 13ms on average (for the same document)

    What could be the reason for 2.0.0a4 being 3x slower than version 0.8.4?

    Can I use 0.8.4 for my current project? Is it still stable?

    s = 'my_sample_document'
    import timeit
    print(timeit.timeit("markdown(s)", setup="from mistune import markdown;from __main__ import s", number=100)/100)
    

    PS: I also found this article, which is related to this topic: https://getnikola.com/blog/markdown-can-affect-performance.html

    opened by josephernest 11
  • Parse line numbers

    Is it possible to retrieve information about the line number of block-level elements in the AST? We are looking into using Mistune for a project and this would be quite helpful.

    E.g., running:

    markdown = mistune.create_markdown(renderer=mistune.AstRenderer())
    markdown("# Heading 1\n\n## Heading2")
    

    to return something like

    [{'type': 'heading',
      'children': [{'type': 'text', 'text': 'Heading 1'}],
      'level': 1,
      'lineno': 0},
     {'type': 'heading',
      'children': [{'type': 'text', 'text': 'Heading2'}],
      'level': 2,
      'lineno': 2}]
    

    If it's not possible currently, is it something that would be doable w/ a PR?

    feature request 
    opened by choldgraf 11
  • Provide context for renderer

    Great work on this; it is indeed fast!

    Would you consider a change something like the prototype in this pull request? It provides the inline_lexer to the renderer method and sets a couple extra values that can be used for more advanced rendering.

    Of course, a full implementation would change all the renderer methods to accept the lexer as an argument and would be breaking backwards compatibility, but it is a very new project so maybe it is in time? If so, I'm willing to make the change and submit it.

    In the exact example usage I'm thinking of, I detect if an image is all alone or surrounded by other elements. Depending on its surroundings or lack thereof, I change the rendered CSS class. So for example, an image followed by text all in one paragraph might float the image left and flow the text around it.

    Maybe an image would better explain:

    sample

    opened by gholt 11
  • Mistune 2.0.0 release?

    Hi,

    I have been testing Markdown rendering in HyperKitty, our archiver for the GNU Mailman project, for some time now using the alpha/rc version of mistune 2.0. While it works okay in our test environments, I was hoping that our released stable versions could rely on non-pre-release dependencies.

    I was wondering if you had any plans for the 2.0 release of mistune? There seem to be no issues or PRs marked with the 2.0 milestone at the moment.

    opened by maxking 10
  • Support for Front Matter?

    I have a blog with Markdown files generated by Jekyll. These files have a metadata block at the top that contains YAML. Is this something that mistune would be willing to support? Jekyll is a very common blog engine, so I think it would be useful to many!

    ---
    id: 29
    title: How this website was built
    date: 2005-12-20T00:00:36
    categories:
      - CSS
    ---
    Hi there early readers!
    
    opened by EmilStenstrom 10
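As a workaround for the front matter request above, the metadata block can be separated from the body before the body is handed to the parser. The sketch below is an assumption-laden illustration, not a mistune feature: split_front_matter is a hypothetical helper, it only recognizes a block delimited by lines containing exactly ---, and it returns the YAML as raw text rather than parsing it.

```python
def split_front_matter(text):
    """Return (front_matter, body); front_matter is None when absent."""
    lines = text.splitlines(keepends=True)
    # Front matter must start on the very first line with a "---" delimiter.
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            return "".join(lines[1:index]), "".join(lines[index + 1:])
    # No closing delimiter: treat the whole text as body.
    return None, text


document = (
    "---\n"
    "id: 29\n"
    "title: How this website was built\n"
    "---\n"
    "Hi there early readers!\n"
)
front_matter, body = split_front_matter(document)
print(repr(front_matter))
print(repr(body))
```

The body could then be passed to mistune.html(body), and the raw front matter to a YAML parser of your choice.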
  • Traceback: 'DocPageRenderer' object has no attribute 'rstrip'

    I am getting a traceback with 0.3.1 (which works with 0.3.0):

    Traceback (most recent call last):
      File "website/crossbario/__init__.py", line 191, in <module>
        pages = DocPages('../crossbar.wiki')
      File "website/crossbario/__init__.py", line 153, in __init__
        self._renderer = mistune.Markdown(renderer = rend, inline = inline)
      File "mistune.py", line 899, in mistune.Markdown.__init__ (mistune.c:16104)
        inline = inline(renderer, **kwargs)
      File "mistune.py", line 502, in mistune.InlineLexer.__call__ (mistune.c:8497)
        return self.output(src)
      File "mistune.py", line 510, in mistune.InlineLexer.output (mistune.c:9059)
        src = src.rstrip('\n')
    AttributeError: 'DocPageRenderer' object has no attribute 'rstrip'
    make: *** [test] Error 1
    
    opened by oberstet 9
  • Weird behaviours in math plugin

    I was checking the math plugin and found four incompatibilities with LaTeX compilers (pdfLaTeX and XeLaTeX) and MathJax:

    • Block math ($$) requires a newline before and after content, effectively changing the delimiters to $$\n and \n$$.
    • Block math should be able to have text before and after it, in the same line.
    • Block math doesn't work with empty content ($$$$), the regex is done with .+ instead of .*.
    • Both math modes don't take care of escaped dollar signs (\$).

    So the following markdown:

    Dollar sign $\$$
    
    Empty $$$$
    
    No newline $$x$$
    

    Is compiled to:

    <p>Dollar sign <span class="math">\(\\)</span>$</p>
    <p>Empty <span class="math">\($\)</span>$</p>
    <p>No newline <span class="math">\($x\)</span>$</p>
    

    Whereas it should probably be:

    <p>Dollar sign <span class="math">\(\$\)</span></p>
    <p>Empty <div class="math">$$$$</div></p>
    <p>No newline <div class="math">$$x$$</div></p>
    

    For comparison, here are the LaTeX and MathJax versions:

    LaTeX version

    Code:

    \documentclass{article}
    
    \begin{document}
        Dollar sign $\$$ \\
        Empty $$$$ \\
        No newline $$x$$
    \end{document}
    

    Output: image

    MathJax version

    Code:

    <html>
        <head>
            <script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
            <script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        </head>
        <body>
            <p>Dollar sign \(\$\)</p>
            <p>Empty $$$$</p>
            <p>No newline $$x$$</p>
        </body>
    </html>
    

    Output: image

    opened by TiagodePAlves 2
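Two of the points in the math plugin report above (empty block content with .+ versus .*, and escaped dollar signs) can be reproduced with plain regular expressions. The patterns below are illustrative assumptions written for this sketch, not the plugin's actual regexes.

```python
import re

# Illustrative patterns only; they are assumptions, not the math plugin's regexes.
BLOCK_NON_EMPTY = re.compile(r"\$\$(.+?)\$\$", re.S)    # requires at least one character
BLOCK_ALLOW_EMPTY = re.compile(r"\$\$(.*?)\$\$", re.S)  # also accepts $$$$
INLINE_NAIVE = re.compile(r"\$(.+?)\$")                  # stops at the first "$", escaped or not
INLINE_ESCAPED = re.compile(r"\$((?:\\.|[^$\\])+)\$")    # treats "\$" as part of the content


def bodies(pattern, text):
    """Return the captured math content for every match of pattern in text."""
    return [match.group(1) for match in pattern.finditer(text)]


print(bodies(BLOCK_NON_EMPTY, "$$$$"))
print(bodies(BLOCK_ALLOW_EMPTY, "$$$$"))
print(bodies(INLINE_NAIVE, r"Dollar sign $\$$"))
print(bodies(INLINE_ESCAPED, r"Dollar sign $\$$"))
```

With these patterns, the .+ variant finds no block in $$$$ while the .* variant captures an empty body, and only the escape-aware inline pattern keeps \$ inside the math content.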
  • type hints

    Would you like a donation of inline type hints? I can send a PR that applies @fmigneault's type hints from https://github.com/common-workflow-language/schema_salad/commit/47332275ae04540a4b71ccfa0ff160edbef01496#diff-b6fac26c3692bb3b814467d88fe4b15140151671dc85689c3d2eb7385e95afd8 as inline type hints, if you would release a new 2.x version with them.

    opened by mr-c 1
  • Support a minimum level requirement in the TOC directive

    Thanks for this awesome library! :heart:

    Motivation

    I'm using mistune to parse documents that always have one level 1 heading. For example, the document might look like this:

    # mistune
    
    A fast yet powerful Python Markdown parser with renderers and plugins.
    
    ## Sponsors
    
    ...
    
    ## Install
    
    ...
    

    The generated table of contents always includes the level 1 heading, which is redundant since it is used as a page title, and is the only level 1 heading.

    Suggestion

    I think a new keyword argument on the directive's class and/or an option on the directive could be helpful. If, for example, the minimum is 2, then only ## to ###### headings would be included in the table of contents.

    opened by spenserblack 2
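The minimum-level suggestion above amounts to filtering collected headings by level before the table of contents is rendered. A minimal sketch, assuming headings are available as (level, title) pairs; filter_toc, min_level, and max_level are hypothetical names chosen for this example, not options of mistune's TOC directive.

```python
def filter_toc(headings, min_level=1, max_level=6):
    """Keep only headings whose level lies within [min_level, max_level]."""
    return [
        (level, title)
        for level, title in headings
        if min_level <= level <= max_level
    ]


headings = [(1, "mistune"), (2, "Sponsors"), (2, "Install")]
toc = filter_toc(headings, min_level=2)
print(toc)
```

With min_level=2, the single level 1 page title is dropped and only the ## to ###### headings remain.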
  • External plugins guidance and ecosystem

    Mistune users are able to write their own plugins, and when they do this locally, it is easy enough for them to import and use their plugin code, so it's available when they use Mistune. Is there a preferred way or best practice for how users should publish their plugins? What is the best way to find community-made plugins? If there are such things, I propose adding them to the docs.

    Taking some inspiration from Lektor, projects could be published to PyPI with a mistune- prefix so they are discoverable via PyPI's search bar, or you could maintain a docs page linking to them. Perhaps PyPA would allow a new trove classifier to list Mistune as a framework, as well (makes sense to me).

    I thought of this coming from the idea that Lektor could more easily pull in arbitrary Mistune plugins. https://github.com/lektor/lektor/issues/1076

    opened by nixjdm 1
  • abbr plugin error?

    image

    https://github.com/lepture/mistune/blob/1264a1c954396fa31304bfac39588381f015a5d8/mistune/plugins/abbr.py#L56-L63

    Maybe def_abbr here should be the same as abbr? My code runs locally without error, but the error occurs on GitHub Actions.

    full log here https://github.com/teedoc/teedoc/actions/runs/3069280480/jobs/4957741255

    opened by Neutree 0
  • Rendering code comment as heading in html inside markdown

    We're embedding some pyscript inside our markdown, and having some trouble with code comments being rendered as headings.

    It appears to be an issue within a custom tag where there is a blank line before the line prefaced with the #.

    To reproduce:

    $ python3
    Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mistune
    >>> mistune.__version__
    '2.0.4'
    >>> mistune.html(r'''<some-tag>
    ... 
    ... # a comment
    ... something with # a trailing comment
    ... </some-tag>
    ... 
    ... ''')
    '<some-tag>\n<h1>a comment</h1>\n<p>something with # a trailing comment</p>\n</some-tag>\n'
    

    What is expected:

    The code/comments within the custom tags (in our case they're <py-script></py-script>) should not be interpreted as markdown (they weren't in a pre-2.x version of mistune), and should produce: <some-tag>\n<p># a comment</p>\n<p>something with # a trailing comment</p>\n</some-tag>\n

    When there is no blank line preceding the code comment, this behavior is not observed:

    >>> mistune.html(r'''<some-tag>
    ... # a comment
    ... something with # a trailing comment
    ... </some-tag>''')
    '<some-tag>\n# a comment\nsomething with # a trailing comment\n</some-tag>\n'
    
    opened by toonarmycaptain 2
Releases(v2.0.4)
Owner
Hsiaoming Yang
A markdown lexer and parser which gives the programmer atomic control over markdown parsing to html.

stonepresto 4 Aug 13, 2022
A fast, extensible and spec-compliant Markdown parser in pure Python.

mistletoe mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest Comm

Mi Yu 546 Jan 1, 2023
Provides syntax for Python-Markdown which allows for the inclusion of the contents of other Markdown documents.

Markdown-Include This is an extension to Python-Markdown which provides an "include" function, similar to that found in LaTeX (and also the C pre-proc

Chris MacMackin 85 Dec 30, 2022
Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files

Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files. Mdformat is a Unix-style command-line tool as well as a Python library.

Executable Books 180 Jan 6, 2023
markdown2: A fast and complete implementation of Markdown in Python

Markdown is a light text markup format and a processor to convert that to HTML. The originator describes it as follows: Markdown is a text-to-HTML con

Trent Mick 2.4k Dec 30, 2022
A lightweight and fast-to-use Markdown document generator based on Python

快乐的老鼠宝宝 1 Jan 10, 2022
Static site generator that supports Markdown and reST syntax. Powered by Python.

Pelican Pelican is a static site generator, written in Python. Write content in reStructuredText or Markdown using your editor of choice Includes a si

Pelican dev team 11.3k Jan 5, 2023
A Python implementation of John Gruber’s Markdown with Extension support.

Python-Markdown This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though

Python-Markdown 3.1k Dec 30, 2022
Extensions for Python Markdown

PyMdown Extensions Extensions for Python Markdown. Documentation Extension documentation is found here: https://facelessuser.github.io/pymdown-extensi

Isaac Muse 685 Jan 1, 2023
Lightweight Markdown dialect for Python desktop apps

Litemark is a lightweight Markdown dialect originally created to be the markup language for the Codegame Platform project. When you run litemark from the command line interface without any arguments, the Litemark Viewer opens and displays the rendered demo.

null 10 Apr 23, 2022
A markdown template manager for writing API docs in python.

DocsGen-py A markdown template manager for writing API docs in python. Contents Usage API Reference Usage You can install the latest commit of this re

Ethan Evans 1 May 10, 2022
Livemark is a static page generator that extends Markdown with interactive charts, tables, and more.

Livemark This software is in the early stages and is not well-tested Livemark is a static site generator that extends Markdown with interactive chart

Frictionless Data 86 Dec 25, 2022
Read a list in markdown and do something with it!

Markdown List Reader A simple tool for reading lists in markdown. Usage Begin by running the mdr.py file and input either a markdown string with the -

Esteban Garcia 3 Sep 13, 2021
Yuque2md - Offline download the markdown file and image from yuque

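yuque2md Exports all Markdown documents from a Yuque knowledge base, following the knowledge base's directory structure, and saves the images locally for offline use. Usage Install Python 3.x, clone the project, download the dep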

JiaJianHuang 4 Oct 30, 2022
Convert HTML to Markdown-formatted text.

html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to

Alireza Savand 1.3k Dec 31, 2022
Comprehensive Markdown plugin built for Django

Django MarkdownX Django MarkdownX is a comprehensive Markdown plugin built for Django, the renowned high-level Python web framework, with flexibility,

neutronX 740 Jan 8, 2023
Awesome Django Markdown Editor, supported for Bootstrap & Semantic-UI

martor Martor is a Markdown Editor plugin for Django, supported for Bootstrap & Semantic-UI. Features Live Preview Integrated with Ace Editor Supporte

null 659 Jan 4, 2023
A super simple script which uses the GitHub API to convert your markdown files to GitHub styled HTML site.

Çalgan Aygün 213 Dec 22, 2022