A Python implementation of John Gruber’s Markdown with Extension support.

Python-Markdown

Last update: Dec 31, 2022

Overview

Python-Markdown

This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though there are a few known issues. See Features for information on what exactly is supported and what is not. Additional features are supported by the Available Extensions.

Documentation

pip install markdown

import markdown
html = markdown.markdown(your_text_string)

For more advanced installation and usage documentation, see the docs/ directory of the distribution or the project website at https://Python-Markdown.github.io/.
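As a small illustration of enabling an extension, the sketch below passes the `extensions` keyword to `markdown.markdown`. The choice of the `tables` extension and the sample text are assumptions made for the example; the project documentation is the authority on which extensions are available.

```python
import markdown

# Sample input chosen for illustration: a pipe table, which core Markdown does not define.
text = "First | Second\n----- | ------\n1 | 2\n"

# Without extensions the pipe syntax is treated as ordinary paragraph text.
plain = markdown.markdown(text)

# With the bundled "tables" extension enabled, the same text becomes an HTML table.
with_tables = markdown.markdown(text, extensions=["tables"])

print(plain)
print(with_tables)
```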

See the change log at https://Python-Markdown.github.io/change_log.

Support

You may report bugs, ask for help, and discuss various other issues on the bug tracker.

Code of Conduct

Everyone interacting in the Python-Markdown project's codebases, issue trackers, and mailing lists is expected to follow the Code of Conduct.

Comments

Refactor HTML Parser

This is experimental. More of the HTMLParser methods need to be fleshed out. So far the basic stuff works as long as there is no invalid HTML in the document (which is untested at this point).

Input:

Some *Markdown* text.

<p>Some *raw* HTML</p>

<span>*inline*</span>

    <p>code block</p>

`<em>code span</em>`

<div>

foo *bar*

* baz bar

blah blah

</div>

More *Markdown*.

Output:

<p>Some <em>Markdown</em> text.</p>
<p>Some *raw* HTML</p>

<p><span><em>inline</em></span></p>
<pre><code>&lt;p&gt;code block&lt;/p&gt;
</code></pre>
<p><code>&lt;em&gt;code span&lt;/em&gt;</code></p>
<div>

foo *bar*

* baz bar

blah blah

</div>

<p>More <em>Markdown</em>.</p>

... which exactly matches the existing behavior.

I haven't actually run the tests on this yet, so I'm curious to see what Travis says...

approved

opened by waylan 51

Infinite execution on some input

With some input, the markdown function never returns: execution runs indefinitely and no exception is raised.

Steps to reproduce: https://gist.github.com/anonymous/ffab9ad433127893f04b9d009cd21444

bug

opened by dattaz 47

AttributeError: module 'importlib' has no attribute 'util' with python-markdown 3.4 on macOS/Windows

With python3.9 on macOS:

$ python3.9 -m venv venv
$ source venv/bin/activate
$ pip install markdown
Collecting markdown
  Using cached Markdown-3.4-py3-none-any.whl (93 kB)
Collecting importlib-metadata>=4.4; python_version < "3.10"
  Using cached importlib_metadata-4.12.0-py3-none-any.whl (21 kB)
Collecting zipp>=0.5
  Using cached zipp-3.8.1-py3-none-any.whl (5.6 kB)
Installing collected packages: zipp, importlib-metadata, markdown
Successfully installed importlib-metadata-4.12.0 markdown-3.4 zipp-3.8.1
WARNING: You are using pip version 20.2.3; however, version 22.1.2 is available.
You should consider upgrading via the '/Users/mike/tmp/resume.md/venv/bin/python3.9 -m pip install --upgrade pip' command.
$ python
Python 3.9.4 (default, Apr 16 2021, 21:18:07)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import markdown
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mike/tmp/resume.md/venv/lib/python3.9/site-packages/markdown/__init__.py", line 22, in <module>
    from .core import Markdown, markdown, markdownFromFile
  File "/Users/mike/tmp/resume.md/venv/lib/python3.9/site-packages/markdown/core.py", line 27, in <module>
    from .preprocessors import build_preprocessors
  File "/Users/mike/tmp/resume.md/venv/lib/python3.9/site-packages/markdown/preprocessors.py", line 29, in <module>
    from .htmlparser import HTMLExtractor
  File "/Users/mike/tmp/resume.md/venv/lib/python3.9/site-packages/markdown/htmlparser.py", line 29, in <module>
    spec = importlib.util.find_spec('html.parser')
AttributeError: module 'importlib' has no attribute 'util'
>>>

With python3.10 on macOS:

$ python3.10 -m venv 3.10
$ source 3.10/bin/activate
$ pip install markdown
Collecting markdown
  Using cached Markdown-3.4-py3-none-any.whl (93 kB)
Installing collected packages: markdown
Successfully installed markdown-3.4
WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/Users/mike/tmp/resume.md/3.10/bin/python3.10 -m pip install --upgrade pip' command.
$ python
Python 3.10.3 (main, Mar 25 2022, 22:16:41) [Clang 12.0.5 (clang-1205.0.22.9)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import markdown
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mike/tmp/resume.md/3.10/lib/python3.10/site-packages/markdown/__init__.py", line 22, in <module>
    from .core import Markdown, markdown, markdownFromFile
  File "/Users/mike/tmp/resume.md/3.10/lib/python3.10/site-packages/markdown/core.py", line 27, in <module>
    from .preprocessors import build_preprocessors
  File "/Users/mike/tmp/resume.md/3.10/lib/python3.10/site-packages/markdown/preprocessors.py", line 29, in <module>
    from .htmlparser import HTMLExtractor
  File "/Users/mike/tmp/resume.md/3.10/lib/python3.10/site-packages/markdown/htmlparser.py", line 29, in <module>
    spec = importlib.util.find_spec('html.parser')
AttributeError: module 'importlib' has no attribute 'util'

pip install "markdown<3.4" works, so this is perhaps a regression in the 3.4 release?
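For what it's worth, the traceback looks consistent with `importlib.util` never having been imported as a submodule. A bare `import importlib` only exposes the `util` attribute if some other module happened to import it first, which could explain why the failure depends on the environment. A minimal sketch of the explicit import, assuming that is the cause here:

```python
# Import the submodule explicitly rather than relying on a bare `import importlib`,
# which does not guarantee that `importlib.util` is available as an attribute.
import importlib.util

# The same lookup that fails in the traceback above.
spec = importlib.util.find_spec('html.parser')
print(spec.name)  # html.parser
```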

opened by mikepqr 39

Abandon or Modify ElementTree?

The short version:

As part of version 3.0 (see #391) should Python-Markdown perhaps abandon ElementTree for a different document object like Docutils' node tree or use a modified ElementTree for internally representing the Parsed HTML document?

Any and all feedback is welcome.

The long version:

Starting in Python-Markdown version 2.0, internally parsed documents have been represented as ElementTree objects. While this mostly works, there are a few irritations. ElementTree (hereinafter ET) was designed for XML, not HTML and therefore a few of its design choices are less than ideal when working with HTML.

For example, by design, XML does not generally have text and child nodes interspersed like HTML does. While ET provides text and tail attributes on each element, it is not as easy to work with as it would be if the text was contained in child "TextNodes" (much like JavaScript's DOM). Additionally, ET nodes have no knowledge of their parent(s), which can be a problem in certain HTML specific situations (some elements cannot contain other elements as children or grandchildren or great-grandchildren...).
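To make the two irritations concrete, here is a small sketch using only the standard library's ElementTree. The sample markup and the `parent_of` helper name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# "foo <em>bar</em> baz" has text on both sides of the child element.
p = ET.fromstring('<p>foo <em>bar</em> baz</p>')
em = p[0]

print(repr(p.text))   # 'foo '  (text before the first child lives on the parent)
print(repr(em.text))  # 'bar'
print(repr(em.tail))  # ' baz'  (text after </em> is stored on the child, not the parent)

# Elements keep no reference to their parent, so a lookup table has to be built by hand.
parent_of = {child: parent for parent in p.iter() for child in parent}
print(parent_of[em].tag)  # p
```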

I see two possible workarounds to this: Modify ET or use a different type of object.

Modifying ElementTree

We already have a modified serializer which gives us better HTML output (it is actually a modified HTML serializer from ET) and we already import ET and document that all extensions should import ET from Markdown. Therefore, if we were to change anything (via subclasses, etc) those changes would propagate throughout all extensions without too much change.

In fact, some time ago, I played around with the idea of making ET nodes aware of their parents. While it worked, I quickly abandoned it as I realized that it would not work for cElementTree. However, on further consideration, we don't really need cElementTree (most of the benefits are in a faster XML parser which we don't use).

Interestingly, in Python 3.3 cElementTree is deprecated. What actually happens is that ET defines the Python implementation and then at the bottom of the module, it tries to import the C implementation, which upon success, overrides the Python objects of the same name. What is interesting about this is that the Python implementation of the Element class (ET's node object) is preserved as _Element_Py for external code which needs access to it (as explained in the comments).

I envision a modified ET lib to basically subclass the Python Element object to enforce knowledge of parents for all nodes. Then a TextNode would be created which works essentially like Comments work now:

def TextElement(text=None):
    element = Element(TextElement)
    element.text = text
    return element

The serializer would then be updated to properly output TextElements. In fact, at some point, the serializer might even be able to lose knowledge of the text and tail attributes on regular nodes. However, that last bit could wait for all extensions to adopt the new stuff.

In addition to TextElement we could also have RawTextElement and AtomicTextElement. Both would be ignored by the parser (no additional parsing would take place). However, a RawTextElement would be given special treatment by the serializer in that no escaping would take place (raw HTML could be stored inline in the document rather than in a separate store with placeholders in the document), whereas an AtomicTextElement would be serialized like a regular TextElement.

The advantage of an AtomicTextElement (over the existing AtomicString) is that a single node could have multiple child text nodes. Today, each node only gets one text attribute. Therefore, when an AtomicString is concatenated with an existing text string, we lose the 'atomic' quality of the sub-string. However, with this change each sub-string can reside in its own separate text node and maintain the 'atomic' quality when necessary.
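A rough sketch of how text nodes and a matching serializer might fit together, using the standard library's ElementTree. This is only an illustration of the idea, not a proposed implementation: `to_html` is a made-up toy serializer, it ignores the text and tail attributes on regular nodes, and parent tracking is left out entirely.

```python
from html import escape
from xml.etree.ElementTree import Element


def TextElement(text=None):
    """Text node whose content is escaped on output (modelled on ET's Comment factory)."""
    element = Element(TextElement)
    element.text = text
    return element


def RawTextElement(text=None):
    """Text node that the serializer emits verbatim, e.g. for raw HTML."""
    element = Element(RawTextElement)
    element.text = text
    return element


def to_html(node):
    """Toy serializer (hypothetical): only child text nodes carry text."""
    if node.tag is TextElement:
        return escape(node.text or '', quote=False)
    if node.tag is RawTextElement:
        return node.text or ''
    inner = ''.join(to_html(child) for child in node)
    return f'<{node.tag}>{inner}</{node.tag}>'


# Build <p> with interleaved text and element children.
p = Element('p')
p.append(TextElement('1 < 2 '))
em = Element('em')
em.append(TextElement('bar'))
p.append(em)
p.append(RawTextElement(' <br>'))

result = to_html(p)
print(result)  # <p>1 &lt; 2 <em>bar</em> <br></p>
```

Because each piece of text is its own child, an "atomic" piece can sit next to ordinary text without the two being merged into one string.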

Using Docutils

Rather than creating our own one-off hacked version of ET, we could instead use an already existing library which gives us all of the same features (and more). Today, the only widely supported and stable library I'm aware of is Docutils' Document Tree. While the Document Tree is described as an XML representation of a document, Docutils provides a Python API to work with the Document Tree which is very similar to the modified ET API I described above (known parents, TextElement, FixedTextElement...). Unfortunately, that API is not documented, although the source code is easy enough to follow.

Until recently, I was of the assumption that to implement something that used Docutils, one would need to define a bunch of directives (etc) which more-or-less modify the ReST parser. However, take a look at the Overview of the Docutils Architecture. A parser simply needs to create a node tree. In fact, the base Parser class is only a few simple methods. The entire directives thing is in a separate directory under the ReST Parser only. Theoretically, one could subclass the base Parser class, and build a node tree using whatever parsing method desired and Docutils wouldn't care.

For that matter, Python-Markdown would not have to replicate Docutils' "Parser" API. We could just use the node tree internally. As a plus, this would give us access to all of the built-in and third party Docutils writers (serializers). In other words, we would get all of Docutils' output formats for free.

Additionally, Docutils' node tree also provides for various meta-data to be stored within the node tree. For example, each node can contain the line and column at which its contents were found in the original source document. This provides an easy way for a spellchecker to run against the parser and report misspelled words in the document without first converting it to HTML, among other uses which do not require serialized output.

No, this would not make Python-Markdown suddenly able to be supported by Sphinx. Sphinx is mostly a collection of custom directives built on top of the ReST parser. ReST directives do not make sense in Markdown. However, we could convert Markdown to ReST as many other third party parsers convert various formats to ReST via a ReST writer. There is also at least one third party writer which outputs Markdown from a node tree. By adopting Docutils node tree, Python-Markdown could become part of an ecosystem for converting between all sorts of various document formats (an expandable competitor to Pandoc?).

The downsides to using Docutils are that we are then relying on a third party library (up till now, Python-Markdown has not) and all extensions would absolutely be forced to change to support the new version. It is also possible that we wouldn't be able to use the available HTML writer as the default because of some inherent differences with Markdown and ReST (ReST is much more verbose and we might need to hack the node tree or the writer to get the writer to output correct HTML from a Markdown perspective -- I have not investigated this).

As it stands now, there are various small changes required of extensions between version 2 and 3, but I expect that most extensions would be able to support both without much effort. If we went with Docutils, that would no longer be the case.

Or, maybe this whole thing is a bad idea and we should just continue to use ET as-is.

Any and all feedback is welcome.

feature core needs-decision

opened by waylan 39

Bold/Italic bug

I think I'm running up against another bold/italics bug. I did some quick searches and it looks like the other issues were considered resolved. Sorry if I'm re-reporting something already fixed that hasn't made it upstream yet.

Installed from pip current Python-Markdown version 3.0.1

The raw markdown line that breaks is:

This is text **bold *italic bold*** with more text

The output I'm getting is as follows:

<p>This is text <strong>bold *italic bold</strong>* with more text</p>

However, the following format does seem to work correctly.

This is text ***bold italic** italic* more text

The output is

<p>This is text <em><strong>bold italic</strong> italic</em> more text</p>

bug confirmed

opened by Dave-ts 27

Add SmartyPants extension as part of Python-Markdown

This is a feature request. It'd be nice if there was a built-in (batteries included) extension to implement SmartyPants quoting by turning on a simple extension.

I notice that someone is already using SmartyPants with Markdown for Python, though not as an extension: http://byrneswoder.com/blog/one-secret-to-generating-clean-html-from-text/

feature someday-maybe

opened by david-a-wheeler 27

Markdown in raw HTML stops working after first raw HTML tag

Hello,

I'm using Markdown and the "extra" extension to support Markdown in raw HTML div elements with the attribute markdown='1' as explained on the example page: https://pythonhosted.org/Markdown/extensions/extra.html

However, as soon as a raw HTML tag without the "markdown" attribute occurs inside the element, Markdown is no longer processed after the end of that tag. If you put any Markdown code after the "Raw html blocks may also be nested." text of the example page, it will not be processed even though it is still in the "markdown='1'" div.

Short example with 3 Markdown uses where the second one is not processed:

<div markdown="1">
Markdown is *active* here.
<div name="RawHtml">
Raw html blocks may also be nested.
</div>
Markdown is *not* active anymore here.
</div>
Markdown is *active again* here.

opened by fpw 26

Fix InlineProcessor and add better italic and bold support

Changes

Fixes issues with tails in InlineProcessor
Adds better italic and bold support
New tests for changes and improved code coverage

Tests

I cannot install Python 3.1 on my OSX Mavericks

  py27: commands succeeded
ERROR:   py31: InterpreterNotFound: python3.1
  py32: commands succeeded
  py33: commands succeeded
  py34: commands succeeded

Name                               Stmts   Miss  Cover   Missing
----------------------------------------------------------------
markdown/__init__                    193     42    78%   141, 330-333, 391-434, 479-493
markdown/__main__                     40      0   100%
markdown/blockparser                  30      0   100%
markdown/blockprocessors             273      7    97%   189, 194-195, 204, 253, 546, 552
markdown/extensions/__init__          34      4    88%   28-29, 36-37
markdown/extensions/abbr              38      0   100%
markdown/extensions/admonition        45      0   100%
markdown/extensions/attr_list         96      0   100%
markdown/extensions/codehilite        99     19    81%   25-27, 43-44, 105-120, 177-178, 181
markdown/extensions/def_list          59      2    97%   95-96
markdown/extensions/extra             54      0   100%
markdown/extensions/fenced_code       48      0   100%
markdown/extensions/footnotes        176      8    95%   91-92, 105, 111, 118, 243, 288-289
markdown/extensions/headerid          84      4    95%   72-73, 75, 103
markdown/extensions/meta              35      0   100%
markdown/extensions/nl2br             11      0   100%
markdown/extensions/sane_lists        17      0   100%
markdown/extensions/smart_strong      14      0   100%
markdown/extensions/smarty            87      0   100%
markdown/extensions/tables            56      0   100%
markdown/extensions/toc              134     18    87%   50-52, 96-104, 118-120, 151, 182, 191
markdown/extensions/wikilinks         49      0   100%
markdown/inlinepatterns              247      1    99%   225
markdown/odict                       113     37    67%   25-32, 35, 42, 54, 57, 60-66, 69-71, 104-105, 108-110, 119-122, 129, 136, 139-140, 160, 185-189
markdown/postprocessors               49      0   100%
markdown/preprocessors               207     17    92%   87, 92, 116, 135, 171, 199, 273, 292-304
markdown/serializers                 153     48    69%   82-83, 106-117, 143, 147-150, 158, 160, 165, 169-174, 181, 203, 218, 224-235, 238, 254, 259, 262, 266, 269
markdown/treeprocessors              187      4    98%   80, 203-205
markdown/util                         59      0   100%
----------------------------------------------------------------
TOTAL                               2687    211    92%

The only modifications that had to be made to existing tests were for these two cases (which I view as improvements):

--- /Users/facelessuser/Desktop/Python-Markdown/tests/misc/para-with-hr.html
+++ actual_output.html
@@ -2,5 +2,5 @@
 <hr />
 <p>Followed by another paragraph.</p>
 <p>Here is another paragraph, followed by:
-*** not an HR.
+<em>*</em> not an HR.
 Followed by more of the same paragraph.</p>

--- /Users/facelessuser/Desktop/Python-Markdown/tests/misc/em_strong.html
+++ actual_output.html
@@ -4,7 +4,7 @@
 <p>With spaces: * *</p>
 <p>Two underscores __</p>
 <p>with spaces: _ _</p>
-<p>three asterisks: ***</p>
+<p>three asterisks: <em>*</em></p>
 <p>with spaces: * * *</p>
 <p>three underscores: ___</p>
 <p>with spaces: _ _ _</p>

Let me know what you think.

opened by facelessuser 25

Deadlock: never ending match() in treeprocessors.py!

Hi, match = pattern.getCompiledRegExp().match(data[startIndex:]) never returns and hangs the Python process. This happens in v2.6.11 on Python 2 and 3, and I guess later versions are affected as well. It happens only with certain input data and with patternIndex = 2. Please see the attached Python file, which contains the sample code, pattern #2, and the data: reg.py.txt

invalid

opened by vladsf 24

Replace homegrown OrderedDict with purpose-built Registry.

All processors and patterns now get "registered" to a Registry. Each item is given a name (string) and a priority. The name is for later reference; the priority can be either an integer or a float and is used for sorting. A Registry instance is a list-like iterable with the items auto-sorted by priority. If two items have the same priority, they are listed in the order they were "registered". Registering a new item with the same name as an already registered item replaces the old item with the new item (however, the new item is sorted by its priority). To remove an item, "deregister" it by name or index.
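The behavior described above can be sketched in a few lines of plain Python. `SketchRegistry` is a hypothetical, simplified stand-in written for illustration and is not the class added by this change. Sorting highest priority first is an assumption borrowed from the related proposal, and the registered items are placeholder strings.

```python
class SketchRegistry:
    """Hypothetical, simplified stand-in for the Registry described above."""

    def __init__(self):
        self._entries = []  # (priority, name, item) tuples, kept sorted

    def register(self, item, name, priority):
        # Registering an existing name replaces the old item.
        self._entries = [e for e in self._entries if e[1] != name]
        self._entries.append((priority, name, item))
        # list.sort is stable, so equal priorities keep registration order.
        # Assumption: higher priorities sort first.
        self._entries.sort(key=lambda e: e[0], reverse=True)

    def deregister(self, name_or_index):
        # Items can be removed by position or by name.
        if isinstance(name_or_index, int):
            del self._entries[name_or_index]
            return
        remaining = [e for e in self._entries if e[1] != name_or_index]
        if len(remaining) == len(self._entries):
            raise KeyError(name_or_index)
        self._entries = remaining

    def __iter__(self):
        return (item for _, _, item in self._entries)

    def __len__(self):
        return len(self._entries)


reg = SketchRegistry()
reg.register('emphasis-pattern', 'emphasis', 20)
reg.register('link-pattern', 'link', 25)
reg.register('my-pattern', 'mine', 22)
reg.register('other-pattern', 'other', 22)   # same priority: stays after 'mine'
first = list(reg)

reg.register('better-emphasis', 'emphasis', 30)  # replaces the old 'emphasis' item
reg.deregister('link')
second = list(reg)

print(first)
print(second)
```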

Fixes #418.

Note that this is an adaptation of #510 which has been rebased onto master.

opened by waylan 24

Replace OrderedDict with prioritized List?

I'm looking for feedback on a possible Extension API change which might be introduced in version 3.0.

Currently (version 2.x), all extensions register where they are run within the parser with our homegrown OrderedDict. Each piece of code is assigned a name in the dict and an extension inserts itself before or after a given name.

patterns.add(SomePattern(), '<emphasis')  # insert before "emphasis" pattern
del patterns['emphasis']                  # remove "emphasis" pattern

What I am proposing is that instead of an OrderedDict, we use a list (as we did in version 1.x). However, each item in the list is assigned a "priority" attribute which is used to sort the list in order. For example, each inlinepattern class would have a priority set (10, 15, 20, 25, 30, ...). Higher numbers get run first. An extension could set a priority of 22 to get placed between items with priorities of 20 & 25. If a second extension needed to also be between 20 & 25 but before the extension with priority 22, it could use priority 23 or 24, and we don't have the possible conflicts that exist now.

The tricky part would be in removing existing patterns. It is easy to do with the named keys. It might be a little more tricky without. And we can't hardcode index position as that can be changed by other extensions. The entire list would need to be searched for the given class instance. Do we set a "name" property on each class for this reason, rely only on the "priority" property, or something else? Perhaps the built-ins could all be assigned to constants. That way, the constant (a text-based name) could be used for reference purposes, but the value would be the integer (much like the logging module's error codes). Or maybe the constants could point to the class instances themselves.

Therefore, where patterns is the list of patterns, one might alter the list like this:

patterns.register(SomePattern(), priority=23)
patterns.deregister(inlinepatterns.EMPHASIS)

I'm using register/deregister as opposed to register/unregister for the reasons stated here (although it could change by popular demand). However, what I'm not sure about is the best way to define the priority:

This is odd as the priority is a function of the registration process, not creation of the class instance:

myinstance = MyPattern(priority=23)
patterns.register(myinstance)

This is easy to understand but then requires the parser to monkeypatch the class instance to attach the priority to the class:

myinstance = MyPattern()
patterns.register(myinstance, priority=23)

In the first example, register is simply an alias to list.append. However, the second example would require something like this:

class PriorityList(list):
    def register(self, item, priority):
        item.priority = priority  # the monkeypatch
        self.append(item)

Of course, regardless of implementation details, we would only sort once after all extensions are loaded.

WHY?

Provides more flexibility to extension authors. Multiple third-party-extensions would be less likely to conflict with each other.

For example, in the current situation, Extension A removes "emphasis" and extension B tries to insert before emphasis (<emphasis). If the user lists the extensions [B, A], everything works fine, but if she lists them as [A, B], then a KeyError will be raised when setting up extension B.

Or two extensions might use the same name. For example multiple "math" extensions currently exist but each does something slightly different. A user could conceivably try to use two of them together, but one might replace the other.

Gets rid of the awkward <emphasis syntax. Ugh.

Removes the homegrown (and mostly untested) OrderedDict (the implementation that ships with Python only allows adding to the end so it is useless for this purpose). The fact that the new implementation is just a sorted list is an implementation detail that does not even need to be mentioned in the docs. Extension authors only need to know about and use the two methods register and deregister on a registry.
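The [A, B] ordering problem described above can be reproduced with a toy model. Everything here is hypothetical: the function names, the two built-in pattern names, and the priorities are made up, and a plain list and dict stand in for the real data structures.

```python
def ext_a_relative(order):
    order.remove('emphasis')  # Extension A removes "emphasis"


def ext_b_relative(order):
    # Extension B inserts itself before "emphasis"; mimic the KeyError if it is gone.
    if 'emphasis' not in order:
        raise KeyError('emphasis')
    order.insert(order.index('emphasis'), 'b_pattern')


def build_relative(extensions):
    """Relative placement: the result depends on the order extensions are loaded."""
    order = ['strong', 'emphasis']
    for ext in extensions:
        ext(order)
    return order


def ext_a_priority(entries):
    entries.pop('emphasis', None)  # removal does not affect anyone else's position


def ext_b_priority(entries):
    entries['b_pattern'] = 22      # lands between priorities 30 and 20


def build_prioritized(extensions):
    """Priority placement: sort once, after all extensions have run."""
    entries = {'strong': 30, 'emphasis': 20}
    for ext in extensions:
        ext(entries)
    return sorted(entries, key=entries.get, reverse=True)


print(build_relative([ext_b_relative, ext_a_relative]))
try:
    build_relative([ext_a_relative, ext_b_relative])
except KeyError as exc:
    print('KeyError:', exc)

print(build_prioritized([ext_a_priority, ext_b_priority]))
print(build_prioritized([ext_b_priority, ext_a_priority]))
```

With priorities, both load orders produce the same result because no extension needs another named item to exist.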

Any and all feedback is welcome.

opened by waylan 24

list render

The following is not rendered as a list:

**[Usage](http://127.0.0.1:8000/cms/pages/82/edit/)**：
- First item
- Second item
- Third item
- Fourth item

However, if you add a new line below the first line, that is,

**[Usage](http://127.0.0.1:8000/cms/pages/82/edit/)**：

- First item
- Second item
- Third item
- Fourth item

it works.

Is there any way to make the first one work? I know editors like StackEdit and Dillinger accept it.

Code to test:

import markdown
s1='''**[Usage](http://127.0.0.1:8000/cms/pages/82/edit/)**：
- First item
- Second item
- Third item
- Fourth item'''

s2='''**[Usage](http://127.0.0.1:8000/cms/pages/82/edit/)**：

- First item
- Second item
- Third item
- Fourth item'''


print(markdown.markdown(s1))  # no blank line before the list
print(markdown.markdown(s2))  # blank line before the list

opened by redstoneleo 0

How do you add a custom PreProcessor?

Effectively the title.

https://python-markdown.github.io/extensions/api/#example describes how to define a custom PreProcessor, but I can't figure out how the heck you're supposed to then use it.

Looking around at some other issues it seems that people are defining a custom extension just to inject their preprocessor. That seems a bit janky of a solution. I could I suppose just append my preprocessor to Markdown.preprocessors after the main Markdown() class has been instantiated, but monkey-patching like that seems fragile.

What is the proper way to use a custom preprocessor? And would it be possible to get the documentation updated to show the new method?
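For reference, the extension-based approach mentioned above might look like the sketch below, assuming the 3.x extension API (`Extension.extendMarkdown` and `md.preprocessors.register`). The class names, the registered name, and the priority of 25 are made up for illustration.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class StripCommentLines(Preprocessor):
    """Hypothetical preprocessor: drop every line that starts with '%%'."""

    def run(self, lines):
        # Preprocessors receive the document as a list of lines and return a list of lines.
        return [line for line in lines if not line.startswith('%%')]


class StripCommentsExtension(Extension):
    """Small wrapper extension whose only job is to register the preprocessor."""

    def extendMarkdown(self, md):
        # The name and the priority are illustrative choices.
        md.preprocessors.register(StripCommentLines(md), 'strip_comment_lines', 25)


html = markdown.markdown(
    '%% note to self\nSome *Markdown* text.',
    extensions=[StripCommentsExtension()],
)
print(html)  # <p>Some <em>Markdown</em> text.</p>
```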

opened by fake-name 0
Update the year in copyright — It's 2023!

At the footer of the doc site, copyright is shown.

https://github.com/Python-Markdown/markdown/blob/4dab9a7436357173fad08a0f4e67a63daaaa15a6/mkdocs.yml#L5

opened by YDX-2147483647 0

SmartyPants: Apostrophes at the start of leading contractions

As mentioned in the original project here, "SmartyPants will turn the apostrophe into an opening single-quote, when in fact it should be a closing one."

This is specifically presenting an issue for me when smarty collides with abbr. For instance, this code:

Each section of the editor controls an access group's ability to view, and actions within, the listed pages. Select the checkboxes to allow the appropriate actions.

*[access group]: Access groups allow admin users to<br>assign custom permissions to all<br>users assigned to that group.

is rendering like this:

because when access group gets rendered by abbr, the ' is no longer recognized as being mid-word and instead renders as a leading single quote for the possessive s.

I've implemented a stopgap via JavaScript, but there are a lot of different contraction situations and it would be great to have a real fix.

bug extension confirmed

opened by feasgal 4

Cannot run as module

python -m markdown <filename>.md

fails with:

Traceback (most recent call last):
  File "/usr/lib/python3.10/importlib/util.py", line 96, in find_spec
    parent_path = parent.__path__
AttributeError: module 'html' has no attribute '__path__'. Did you mean: '__name__'?

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/home/ervin/.local/lib/python3.10/site-packages/markdown/__init__.py", line 22, in <module>
    from .core import Markdown, markdown, markdownFromFile
  File "/home/ervin/.local/lib/python3.10/site-packages/markdown/core.py", line 27, in <module>
    from .preprocessors import build_preprocessors
  File "/home/ervin/.local/lib/python3.10/site-packages/markdown/preprocessors.py", line 29, in <module>
    from .htmlparser import HTMLExtractor
  File "/home/ervin/.local/lib/python3.10/site-packages/markdown/htmlparser.py", line 29, in <module>
    spec = importlib.util.find_spec('html.parser')
  File "/usr/lib/python3.10/importlib/util.py", line 98, in find_spec
    raise ModuleNotFoundError(
ModuleNotFoundError: __path__ attribute not found on 'html' while trying to find 'html.parser'

Any ideas?

markdown_py works fine.

Running on Arch currently on linux-next.

more-info-needed

opened by ervinpopescu 4

RFC: Should `markdown` raise custom exceptions?

I recently reviewed some code like this:

try:
    markdown.markdown(…)
except:
    return 'Rendering failed!'

… and I was not happy with the overly broad exception clause.

I decided to make a concrete suggestion for improvement by at least catching the markdown exception base class, but then I discovered that none exists and that markdown raises very generic Python exceptions, such as TypeError, ValueError, etc.

I was surprised by this, as I'm used to Python libraries using their own custom exception hierarchies and not doing so encourages users to write code like the one above.

So I wanted to suggest introducing a custom exception hierarchy for markdown.

I'd volunteer to make the changes, as they would be rather "mechanical" and would not require deep familiarity with the code, but I wanted to trigger a little discussion about the merits of this idea before doing all the work and possibly having it rejected.
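To make the suggestion concrete, one possible shape for such a hierarchy is sketched below. None of these classes exist in markdown today; the names are hypothetical, `render` is a stand-in rather than the real `markdown.markdown`, and deriving from the built-in exceptions as well is just one option for keeping existing handlers working.

```python
class MarkdownError(Exception):
    """Hypothetical base class for every error raised by the library."""


class MarkdownTypeError(MarkdownError, TypeError):
    """Also a TypeError, so existing `except TypeError` handlers keep working."""


class MarkdownValueError(MarkdownError, ValueError):
    """Also a ValueError, for the same backwards-compatibility reason."""


def render(text):
    """Stand-in for a rendering call; not the real markdown.markdown."""
    if not isinstance(text, str):
        raise MarkdownTypeError('expected str input')
    return f'<p>{text}</p>'


def safe_render(text):
    # Callers can now catch only library errors instead of using a bare except.
    try:
        return render(text)
    except MarkdownError:
        return 'Rendering failed!'


print(safe_render('hello'))
print(safe_render(b'bytes are rejected'))
```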
needs-decision

opened by der-gabe 4

Owner

Python-Markdown

A Python implementation of John Gruber’s Markdown with extensions.

GitHub https://python-markdown.github.io/

Provides syntax for Python-Markdown which allows for the inclusion of the contents of other Markdown documents.

Markdown-Include This is an extension to Python-Markdown which provides an "include" function, similar to that found in LaTeX (and also the C pre-proc

85 Dec 30, 2022

Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files

Mdformat is an opinionated Markdown formatter that can be used to enforce a consistent style in Markdown files. Mdformat is a Unix-style command-line tool as well as a Python library.

180 Jan 6, 2023

A markdown lexer and parser which gives the programmer atomic control over markdown parsing to html.

4 Aug 13, 2022

Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed. Now in Python!

markdown-it-py Markdown parser done right. Follows the CommonMark spec for baseline parsing Configurable syntax: you can add new rules and even replac

398 Dec 24, 2022

markdown2: A fast and complete implementation of Markdown in Python

Markdown is a light text markup format and a processor to convert that to HTML. The originator describes it as follows: Markdown is a text-to-HTML con

2.4k Dec 30, 2022

A fast yet powerful Python Markdown parser with renderers and plugins.

Mistune v2 A fast yet powerful Python Markdown parser with renderers and plugins. NOTE: This is the re-designed v2 of mistune. Check v1 branch for ear

2.2k Jan 4, 2023

Static site generator that supports Markdown and reST syntax. Powered by Python.

Pelican Pelican is a static site generator, written in Python. Write content in reStructuredText or Markdown using your editor of choice Includes a si

11.3k Jan 5, 2023

Extensions for Python Markdown

PyMdown Extensions Extensions for Python Markdown. Documentation Extension documentation is found here: https://facelessuser.github.io/pymdown-extensi

685 Jan 1, 2023

A fast, extensible and spec-compliant Markdown parser in pure Python.

mistletoe mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest Comm

546 Jan 1, 2023

Lightweight Markdown dialect for Python desktop apps

Litemark is a lightweight Markdown dialect originally created to be the markup language for the Codegame Platform project. When you run litemark from the command line interface without any arguments, the Litemark Viewer opens and displays the rendered demo.