Safely add untrusted strings to HTML/XML markup.

The Pallets Projects

Last update: Dec 31, 2022

Related tags

HTML Manipulation python html template-engine html-escape jinja markupsafe pallets

Overview

MarkupSafe

MarkupSafe implements a text object that escapes characters so it is safe to use in HTML and XML. Characters that have special meanings are replaced so that they display as the actual characters. This mitigates injection attacks, meaning untrusted user input can safely be displayed on a page.

Installing

Install and update using pip:

pip install -U MarkupSafe

Examples

>>> from markupsafe import Markup, escape

>>> # escape replaces special characters and wraps in Markup
>>> escape("<script>alert(document.cookie);</script>")
Markup('&lt;script&gt;alert(document.cookie);&lt;/script&gt;')

>>> # wrap in Markup to mark text "safe" and prevent escaping
>>> Markup("<strong>Hello</strong>")
Markup('<strong>Hello</strong>')

>>> escape(Markup("<strong>Hello</strong>"))
Markup('<strong>Hello</strong>')

>>> # Markup is a str subclass
>>> # methods and operators escape their arguments
>>> template = Markup("Hello <em>{name}</em>")
>>> template.format(name='"World"')
Markup('Hello <em>&#34;World&#34;</em>')
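
As a further illustration of "methods and operators escape their arguments", the sketch below applies %-formatting, concatenation, and join() to a Markup value. The user_input string is a hypothetical untrusted value chosen for this example, and the snippet assumes MarkupSafe is installed.

```python
from markupsafe import Markup

# Hypothetical untrusted value, e.g. taken from a form field.
user_input = '<img src=x onerror="alert(1)">'

# %-formatting on a Markup template escapes the argument.
greeting = Markup("<p>Hello, %s!</p>") % user_input

# Concatenating a plain str onto Markup escapes the plain str,
# while Markup operands are left as they are.
combined = Markup("<b>") + user_input + Markup("</b>")

# join() escapes plain str items but keeps Markup items unchanged.
items = Markup(", ").join(["<a>", Markup("<i>ok</i>")])

print(greeting)
print(combined)
print(items)
```

In each case the result is still a Markup instance, so it can be passed on without being escaped a second time.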

Donate

The Pallets organization develops and supports MarkupSafe and other libraries that use it. In order to grow the community of contributors and users, and allow the maintainers to devote more time to the projects, please donate today.

Links

Website: https://palletsprojects.com/p/markupsafe/
Documentation: https://markupsafe.palletsprojects.com/
Releases: https://pypi.org/project/MarkupSafe/
Code: https://github.com/pallets/markupsafe
Issue tracker: https://github.com/pallets/markupsafe/issues
Test status: https://dev.azure.com/pallets/markupsafe/_build
Official chat: https://discord.gg/t6rrQZH

Comments

Unicode incorrectly escaped on PyPy 7.3.1

On PyPy (specifically, the wheel with _speedups.pypy36-pp73-x86_64-linux-gnu.so), markupsafe.escape() incorrectly escapes the Unicode string below:

Input: 'https://x?<ab c>&q"+%3D%2B"="fö%26=o"'

_native: 'https://x?<ab c>&q"+%3D%2B"="fö%26=o"'
_speedups (on pypy): 'https://x?<ab c>&q"+%3D%2B"="fÃ¶%26=o'

This was caught while testing PyPy support for synapse in https://github.com/matrix-org/synapse/pull/9123, where one of the tests failed on this input.

I am no expert in C, but I have a feeling one of the Unicode APIs used in _speedups.c is either being used incorrectly, or has an implementation mismatch compared to CPython.
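
For readers unfamiliar with this kind of garbling, the short sketch below reproduces the same visible symptom in pure Python: UTF-8 bytes that are read back one byte per character. It only illustrates what the corrupted output looks like; it is not a diagnosis of _speedups.c, and the choice of Latin-1 as the wrong decoding is an assumption made for the illustration.

```python
original = "f\u00f6"  # "fö", as in the reported input

# Encode as UTF-8, then (incorrectly) treat every byte as one Latin-1 character.
garbled = original.encode("utf-8").decode("latin-1")

# The two UTF-8 bytes of "ö" (0xC3, 0xB6) show up as U+00C3 and U+00B6.
print([hex(ord(ch)) for ch in garbled])

# Reversing the wrong step recovers the original text.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == original)
```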

opened by ShadowJonathan 26

Consider using the stable API when building wheels for CPython

I noticed your tweet about having lots of wheels (thanks for the PyPy one BTW). Perhaps you could consider creating stable ABI wheels? I think if you add

[bdist_wheel]
py-limited-api = cp34

to the setup.cfg, then python setup.py bdist_wheel will build something like MarkupSafe-1.1.1-cp34-abi3-macosx_10_6_intel.whl, which any CPython >= 3.4 will support on macOS. This would save having to create separate CPython 3.4, CPython 3.5, ... wheels and would also future-proof you against any new versions of CPython.

You might need to adjust the CI wheel build to build the wheel only once: the various versions of CPython will build a wheel with the same name, and I think bdist_wheel is unhappy if a wheel already exists.

opened by mattip 26

build py_limited_api abi3 wheel and macOS universal2 wheel

Use py_limited_api to build one wheel for 3.6+. Currently testing this.

Use cibuildwheel as a GitHub action. Don't need setup-python or pip install.

Build a macOS universal2 wheel. From what I could tell from the cibuildwheel docs, this seems to be preferred over building the arm64 wheel.

closes #175

opened by davidism 17

Won't install on Windows

Python 3.6 (x86-64)
pip == 9.0.1
setuptools == 34.3.2

(no36) D:\Git\project>pip install --force --upgrade markupsafe
Collecting markupsafe
  Using cached MarkupSafe-1.0.tar.gz
Building wheels for collected packages: markupsafe
  Running setup.py bdist_wheel for markupsafe ... error
  Failed building wheel for markupsafe
  Running setup.py clean for markupsafe
Failed to build markupsafe
Installing collected packages: markupsafe
  Found existing installation: markupsafe 1.0
    Uninstalling markupsafe-1.0:
      Successfully uninstalled markupsafe-1.0
  Running setup.py install for markupsafe ... error
  Rolling back uninstall of markupsafe
Exception:
Traceback (most recent call last):
  File "d:\python\no36\lib\site-packages\pip\compat\__init__.py", line 73, in console_to_str
    return s.decode(sys.__stdout__.encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 66: invalid continuation byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\python\no36\lib\site-packages\pip\basecommand.py", line 215, in main
    status = self.run(options, args)
  File "d:\python\no36\lib\site-packages\pip\commands\install.py", line 342, in run
    prefix=options.prefix_path,
  File "d:\python\no36\lib\site-packages\pip\req\req_set.py", line 784, in install
    **kwargs
  File "d:\python\no36\lib\site-packages\pip\req\req_install.py", line 878, in install
    spinner=spinner,
  File "d:\python\no36\lib\site-packages\pip\utils\__init__.py", line 676, in call_subprocess
    line = console_to_str(proc.stdout.readline())
  File "d:\python\no36\lib\site-packages\pip\compat\__init__.py", line 75, in console_to_str
    return s.decode('utf_8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 66: invalid continuation byte
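
The final error in this traceback can be reproduced in isolation. The sketch below assumes, purely for illustration, that the subprocess output contained an accented character encoded with a legacy Windows code page (cp1252 here); the sample text is hypothetical and not taken from the actual build log.

```python
# Hypothetical console output with an accented character,
# encoded with a legacy code page (assumed to be cp1252).
raw = "cr\u00e9ation".encode("cp1252")  # b'cr\xe9ation'

try:
    raw.decode("utf-8")
    error_reason = None
except UnicodeDecodeError as exc:
    # 0xe9 starts a multi-byte UTF-8 sequence, but the next byte is plain ASCII.
    error_reason = exc.reason

# Decoding with the encoding that was actually used succeeds.
decoded_ok = raw.decode("cp1252")

# Decoding as UTF-8 with errors="replace" avoids the exception but loses the character.
decoded_lossy = raw.decode("utf-8", errors="replace")

print(error_reason)
```

The byte 0xe9 is "é" in cp1252 and Latin-1, which is consistent with, though not proof of, localized compiler or setup output being decoded as UTF-8.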

opened by debnet 17

ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/site-packages/markupsafe/__init__.py)

With the new release I get the error below. It works fine with version 2.0.1.

ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/site-packages/markupsafe/__init__.py)

opened by nitsharm1910 11

Build without speedups

I've been building MarkupSafe using the --without-speedups option, because I have a requirement to distribute this module across various OSes in my network. All of a sudden, I now see that this option is no longer available for setup.py. How can I build a pure-Python setup without building the C extension, for distribution, as I previously did?

opened by aabdnn 11

Fixed format for keyword arguments

In [1]: from markupsafe import Markup

In [2]: Markup(u'{a}').format(a=u'<>')
Out[2]: Markup(u'<>')

In [3]: Markup(u'{}').format(u'<>')
Out[3]: Markup(u'&lt;&gt;')
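
The session above shows a keyword argument passing through unescaped while a positional one is escaped. A small check of the behavior this change targets could look like the following, assuming an installed MarkupSafe release in which keyword arguments are escaped like positional ones (the README example with name='"World"' suggests current releases behave this way).

```python
from markupsafe import Markup

# Positional and keyword arguments should be escaped in the same way.
positional = Markup("{}").format("<>")
keyword = Markup("{a}").format(a="<>")

print(positional)
print(keyword)
print(positional == keyword)
```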

opened by anti-social 11

Breaking change in 2.1.0

After installing a third-party tool, this error occurred:

File "C:\Users\yves.pipault\AppData\Roaming\Python\Python310\site-packages\jinja2\utils.py", line 642, in <module>
    from markupsafe import Markup, escape, soft_unicode

Environment:

Python version: 3.10

MarkupSafe version: 2.1.0

Workaround: pip install markupsafe==2.0.1

According to changelog of version 2.1.0, we can read:

Remove soft_unicode, which was previously deprecated. Use soft_str instead. #261

That is a BREAKING CHANGE, so you SHOULD increase the MAJOR version according to semantic versioning: https://semver.org/
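
For code you control that still imports the old name, one possible approach is a guarded import that prefers soft_str, as the changelog entry recommends, and falls back to soft_unicode only on older MarkupSafe releases. This is a sketch rather than an official migration recipe, and it does not help when the failing import is inside a dependency such as the jinja2 file in the traceback above.

```python
from markupsafe import Markup

try:
    # Name recommended by the 2.1.0 changelog entry quoted above.
    from markupsafe import soft_str
except ImportError:
    # Assumption: an older MarkupSafe release that only provides the old name.
    from markupsafe import soft_unicode as soft_str

# soft_str converts non-string objects to str ...
plain = soft_str(42)

# ... and keeps an existing Markup value as Markup.
kept = soft_str(Markup("<b>bold</b>"))

print(repr(plain))
print(repr(kept))
```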
opened by yvsppt 10

ImportError: cannot import name 'soft_unicode' from 'markupsafe'

requirements.txt:

flask~=1.1.2
pandas~=1.2.2
Werkzeug~=1.0.1
numpy~=1.20.1
twilio~=7.2.0
waitress~=2.0.0
pyjwt~=2.3.0
cryptography~=36.0.1

python -m venv venv
venv\scripts\activate
pip install -r requirements.txt

python app.py

Error:

(venv) D:\SonicApi>python app.py
Traceback (most recent call last):
  File "D:\SonicApi\app.py", line 8, in <module>
    from flask import Flask, request, abort, jsonify
  File "D:\SonicApi\venv\lib\site-packages\flask\__init__.py", line 14, in <module>
    from jinja2 import escape
  File "D:\SonicApi\venv\lib\site-packages\jinja2\__init__.py", line 12, in <module>
    from .environment import Environment
  File "D:\SonicApi\venv\lib\site-packages\jinja2\environment.py", line 25, in <module>
    from .defaults import BLOCK_END_STRING
  File "D:\SonicApi\venv\lib\site-packages\jinja2\defaults.py", line 3, in <module>
    from .filters import FILTERS as DEFAULT_FILTERS  # noqa: F401
  File "D:\SonicApi\venv\lib\site-packages\jinja2\filters.py", line 13, in <module>
    from markupsafe import soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (D:\SonicApi\venv\lib\site-packages\markupsafe\__init__.py)

Python version: Python 3.9.10

MarkupSafe version: MarkupSafe 2.1.0
opened by nachiketrss 9

Build aarch64 wheels

From v1.8.0, cibuildwheel allows for building non-native architectures with the CIBW_ARCHS_LINUX option.

See https://cibuildwheel.readthedocs.io/en/stable/options/#archs for more details.

opened by janaknat 8

Tests fail under pypy/pypy3 due to reliance on garbage collection policies.

Howdy!

MarkupLeakTestCase is the test in question. Under pypy, which does not run the garbage collection routines when expected, more objects will be returned via gc.get_objects() than one would expect.

The test should be skipped on non-CPython platforms.
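
One way to express such a skip with only the standard library is sketched below. The test class and its body are hypothetical stand-ins, not the actual MarkupLeakTestCase; they only show a test that depends on CPython's immediate reference-count collection being skipped on other interpreters.

```python
import platform
import unittest
import weakref

# True only when running on the CPython interpreter.
IS_CPYTHON = platform.python_implementation() == "CPython"


class Payload:
    """Hypothetical object used only for this illustration."""


class CollectionTimingExample(unittest.TestCase):
    @unittest.skipUnless(IS_CPYTHON, "depends on CPython's reference-counting collection")
    def test_object_freed_immediately(self):
        obj = Payload()
        ref = weakref.ref(obj)
        del obj
        # On CPython the object is reclaimed as soon as its last reference is dropped.
        self.assertIsNone(ref())


def run_example():
    """Run the example test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CollectionTimingExample)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_example()
print(result.wasSuccessful())
```

On PyPy the test is reported as skipped rather than failed, so the run still counts as successful.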

opened by amcgregor 8

Imports from _speedups have side effects

Description

It appears that (binary!) imports from markupsafe._speedups have side-effects in a pylint invocation. The file that is linted attempts to import a missing package in a try-except clause. A common invocation of pylint produces empty (plug-in) reports if markupsafe is imported before pylint is linting (e.g. imported by a pylint plugin). However, output is produced as expected if either:

the missing import is installed; or

the try block is replaced by a non-import statement (e.g. raise ValueError); or

markupsafe.__init__ is modified to prefer imports from markupsafe._native; or

one imports a Cython package, e.g. ormsgpack, instead of markupsafe.

The first two observations may help in determining the root issue, but are not the problem themselves. The second two observations are what made me report the issue at MarkupSafe, as that suggests that the issue is specific to this package. However: I don't know where the root cause is, so I'm submitting this issue to both MarkupSafe and Pylint and trust that the community has a more complete understanding of the powers at play.

The issue at Pylint (copy of this issue): https://github.com/PyCQA/pylint/issues/8026.

This issue lies at the root of https://gitlab.com/smueller18/pylint-gitlab/-/issues/18.

Setup

I have two files in the same directory: invoke_pytest.py and my_module.py

# invoke_pytest.py
import markupsafe  # this import is not used -- it can only contribute side-effects or invoke pylint side-effects!
from pylint import run_pylint

run_pylint(
    argv=[
        "--output-format=pylint_junit.JUnitReporter:pylint-report.xml",
        "my_module.py",
    ]
)

# my_module.py
try:
    import gmpy2
except ImportError:
    pass

I have a virtual environment in which I installed pylint, pylint-junit, and markupsafe. pylint-junit is merely needed to show that a pylint plugin becomes unable to produce output. The issue surfaced originally in pylint-gitlab, which has markupsafe as a dependency through jinja2.

$ python invoke_pytest.py; head pylint-report.xml

Expected output

After commenting out import markupsafe in invoke_pytest.py, the output is as expected:

$ python invoke_pytest.py; head pylint-report.xml
<?xml version="1.0" ?>
<testsuites disabled="0" errors="0" failures="2" tests="4" time="0.0">
  <testsuite disabled="0" errors="0" failures="0" name="Command line or configuration file" skipped="0" tests="1" time="0">
    <testcase name="Command line or configuration file:0:0" classname="pylint">
      <system-out>All checks passed for: None</system-out>
    </testcase>
  </testsuite>
  <testsuite disabled="0" errors="0" failures="2" name="my_module" skipped="0" tests="3" time="0">
    <testcase name="my_module:0:0" classname="pylint" file="my_module.py">
      <system-out>All checks passed for: my_module.py</system-out>

As mentioned above, the same output is achieved if markupsafe.__init__ is adjusted to not import from ._speedups.

Environment

Windows 10 -> WSL2 -> Ubuntu

$ python --version
Python 3.10.9
$ pip list
Package           Version
----------------- -------
astroid           2.12.13
dill              0.3.6
isort             5.11.4
junit-xml-2       1.9
lazy-object-proxy 1.9.0
MarkupSafe        2.1.1
mccabe            0.7.0
pip               22.3.1
platformdirs      2.6.2
pylint            2.15.9
pylint-junit      0.3.2
setuptools        65.5.0
six               1.16.0
tomli             2.0.1
tomlkit           0.11.6
wrapt             1.14.1

I understand that the example is not quite minimal yet -- there is a lot of pylint stuff under the hood. Again, let me emphasize that I reach out to both communities and I hope to jointly reduce the example shortly.

I want to finish this report by stating that I appreciate all your efforts, keep up the good work!
opened by b-kamphorst 0

Test Python 3.11 on Linux, Windows and Mac, add 3.12-dev

Update tests for 3.11 and 3.12-dev. Update dependencies since flake8 was not compatible with Python 3.12.

Only after making the changes did I notice that #329 already existed. Since there has been no release since then, I'm creating this pull request anyway in the hope that it helps get a release with 3.11 support out sooner rather than later.

fixes #327

Checklist:

[ ] Add tests that demonstrate the correct behavior of the change. Tests should fail without the change.

[ ] Add or update relevant docs, in the docs folder and in code.

[X] Add an entry in CHANGES.rst summarizing the change and linking to the issue.

[ ] Add .. versionchanged:: entries in any relevant code docs.

[X] Run pre-commit hooks and fix any issues.

[X] Run pytest and tox, no tests failed.
opened by kohtala 0

release 2.1.2

Can a 2.1.2 release be made?

The last release, 2.1.1 in March, had a regression. It has been fixed but not released. https://github.com/pallets/markupsafe/blob/main/CHANGES.rst

This is particularly troublesome since Werkzeug depends on MarkupSafe>=2.1.1, so it pulls in a version of MarkupSafe with a regression.

Thanks

opened by brondsem 0

Releases(2.1.1)

2.1.1(Mar 15, 2022)
Changes: https://markupsafe.palletsprojects.com/en/2.1.x/changes/#version-2-1-1

Milestone: https://github.com/pallets/markupsafe/milestone/7?closed=1

2.1.0(Feb 18, 2022)
Changes: https://markupsafe.palletsprojects.com/en/2.1.x/changes/#version-2-1-0

Milestone: https://github.com/pallets/markupsafe/milestone/5

2.0.1(May 18, 2021)
Changes: https://markupsafe.palletsprojects.com/en/2.0.x/changes/#version-2-0-1

2.0.0(May 12, 2021)
New major versions of all the core Pallets libraries, including MarkupSafe 2.0, have been released! :tada:

Read the announcement on our blog: https://palletsprojects.com/blog/flask-2-0-released/

Read the full list of changes: https://markupsafe.palletsprojects.com/changes/#version-2-0-0

Retweet the announcement on Twitter: https://twitter.com/PalletsTeam/status/1392266507296514048

Follow our blog, Twitter, or GitHub to see future announcements.

This represents a significant amount of work, and there are quite a few changes. Be sure to carefully read the changelog, and use tools such as pip-compile and Dependabot to pin your dependencies and control your updates.
2.0.0rc2(Apr 16, 2021)
Changes: https://markupsafe.palletsprojects.com/en/master/changes/#version-2-0-0


Owner

The Pallets Projects

GitHub https://markupsafe.palletsprojects.com

A HTML-code compiler-thing that lets you reuse HTML code.

RHTML RHTML stands for Reusable-Hyper-Text-Markup-Language, and is pronounced "Rech-tee-em-el" despite how its abbreviation is. As the name stands, RH

4 Nov 15, 2021

This project takes a special TXT file as input, divides its content into a list of HTML objects, and then creates an HTML file from them.

1 Jan 10, 2022

Lektor-html-pretify - Lektor plugin to pretify the HTML DOM using Beautiful Soup

html-pretify Lektor plugin to pretify the HTML DOM using Beautiful Soup. How doe

2 Nov 8, 2022

Converts XML to Python objects

untangle Documentation Converts XML to a Python object. Siblings with similar names are grouped into a list. Children can be accessed with parent.chil

567 Nov 30, 2022

Python module that makes working with XML feel like you are working with JSON

xmltodict xmltodict is a Python module that makes working with XML feel like you are working with JSON, as in this "spec": >>> print(json.dumps(xmltod

5k Jan 4, 2023

The lxml XML toolkit for Python

What is lxml? lxml is the most feature-rich and easy-to-use library for processing XML and HTML in the Python language. It's also very fast and memory

2.3k Jan 2, 2023

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

html5lib html5lib is a pure-python library for parsing HTML. It is designed to conform to the WHATWG HTML specification, as is implemented by all majo

1k Dec 27, 2022

Pythonic HTML Parsing for Humans™

Requests-HTML: HTML Parsing for Humans™ This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When us

12.9k Jan 1, 2023

A library for converting HTML into PDFs using ReportLab

XHTML2PDF The current release of xhtml2pdf is xhtml2pdf 0.2.5. Release Notes can be found here: Release Notes As with all open-source software, its us

2k Dec 27, 2022

Generate HTML using Python 3 with an API that follows the DOM standard specification.

Generate HTML using Python 3 with an API that follows the DOM standard specification. A JavaScript API and tons of cool features. Can be used as a fast prototyping tool.

114 Dec 14, 2022

A python HTML builder library.

PyML A python HTML builder library. Goals Fully functional html builder similar to the javascript node manipulation. Implement an html parser that ret

8 Jul 4, 2022

Modded MD conversion to HTML

MDPortal A module to convert an md-esque lang to html Basically I ruined md in an attempt to convert it to html Overview Here is a demo file from parse

1 Nov 27, 2021

Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API

Dominate Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API. It allows you to write HTML pages in pure

1.5k Jan 9, 2023

Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

146 Dec 18, 2022

Safely pass trusted data to untrusted environments and back.

ItsDangerous ... so better sign this Various helpers to pass data to untrusted environments and to get it back safe and sound. Data is cryptographical

2.6k Jan 1, 2023

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes

Bleach Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes. Bleach can also linkify text safely, appl

2.5k Dec 29, 2022

Extract embedded metadata from HTML markup

extruct extruct is a library for extracting embedded metadata from HTML markup. Currently, extruct supports: W3C's HTML Microdata embedded JSON-LD Mic

725 Jan 3, 2023

Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

11 Feb 8, 2022

Discord Bot that leverages the idea of nested containers using podman, runs untrusted user input, executes Quantum Circuits, allows users to refer to the Qiskit Documentation, and provides the ability to search questions on the Quantum Computing StackExchange.

23 Oct 18, 2022
