HTML minifier for Python frameworks (not only Django, despite the name).

Overview

django-htmlmin

django-htmlmin is an HTML minifier for Python, with full support for HTML 5. It supports Django, Flask and many other Python web frameworks. It also provides a command line tool that can be used for static websites or deployment scripts.

Why minify HTML code?

Minifying HTML is an important part of client-side optimization. Minified HTML reduces the amount of data transferred from the server to the client, which results in faster load times.

Installing

To install django-htmlmin, run this in the terminal:

$ [sudo] pip install django-htmlmin

Using the middleware

All you need to do is add two middlewares to your MIDDLEWARE_CLASSES and enable the HTML_MINIFY setting:

MIDDLEWARE_CLASSES = (
    # other middleware classes
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)

Note that if you're using Django's caching middleware, MarkRequestMiddleware should go after FetchFromCacheMiddleware, and HtmlMinifyMiddleware should go after UpdateCacheMiddleware:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    # other middleware classes
    'django.middleware.cache.FetchFromCacheMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)
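The ordering follows from how Django runs middleware: request hooks are applied from top to bottom, response hooks from bottom to top. The sketch below only simulates that ordering with plain lists (it does not import Django, and short_name is a hypothetical helper), to show that HtmlMinifyMiddleware handles the response just before UpdateCacheMiddleware does, so the body reaching the cache should already be minified.

```python
# Middleware order taken from the settings example above.
MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    # other middleware classes
    'django.middleware.cache.FetchFromCacheMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
)


def short_name(dotted_path):
    """Hypothetical helper: keep only the class name of a dotted path."""
    return dotted_path.rsplit('.', 1)[1]


# Request hooks run top-down, response hooks run bottom-up.
request_order = [short_name(path) for path in MIDDLEWARE_CLASSES]
response_order = list(reversed(request_order))

print(response_order)
# ['MarkRequestMiddleware', 'FetchFromCacheMiddleware',
#  'HtmlMinifyMiddleware', 'UpdateCacheMiddleware']
```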

You can optionally specify the HTML_MINIFY setting:

HTML_MINIFY = True

The default value for the HTML_MINIFY setting is not DEBUG. You only need to set it to True if you want to minify your HTML code when DEBUG is enabled.
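As a rough illustration of that default, the sketch below resolves the effective value the way the sentence above describes: an explicit HTML_MINIFY wins, otherwise the result is the opposite of DEBUG. The should_minify function and the SimpleNamespace objects are hypothetical stand-ins for django.conf.settings, not the library's own code.

```python
from types import SimpleNamespace


def should_minify(settings):
    """Return HTML_MINIFY if it is set, otherwise fall back to `not DEBUG`."""
    return getattr(settings, 'HTML_MINIFY', not settings.DEBUG)


# Stand-ins for django.conf.settings (assumption: only these attributes matter).
production = SimpleNamespace(DEBUG=False)
development = SimpleNamespace(DEBUG=True)
forced = SimpleNamespace(DEBUG=True, HTML_MINIFY=True)

print(should_minify(production))   # True
print(should_minify(development))  # False
print(should_minify(forced))       # True
```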

Excluding some URLs

If you don't want to minify the views of an app that lives under the /my_app URL, you can tell the middleware to skip those responses by adding an EXCLUDE_FROM_MINIFYING setting to your settings.py:

EXCLUDE_FROM_MINIFYING = ('^my_app/', '^admin/')

Regex patterns are used for URL exclusion. If you want to exclude all URLs of your app except a specific view, you can use the @minified_response decorator (see "Using the decorator" below).
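To get a feel for which URLs such patterns cover, here is a minimal sketch using only the standard re module. It assumes the patterns are matched against the request path with its leading slash stripped, which is what the slash-less patterns above suggest; is_excluded is a hypothetical helper, not part of django-htmlmin.

```python
import re

EXCLUDE_FROM_MINIFYING = ('^my_app/', '^admin/')


def is_excluded(path, patterns=EXCLUDE_FROM_MINIFYING):
    """Return True if any pattern matches the path (leading slash removed)."""
    # Assumption: matching is done on the path without its leading slash.
    url = path.lstrip('/')
    return any(re.match(pattern, url) for pattern in patterns)


print(is_excluded('/my_app/reports/'))  # True
print(is_excluded('/admin/login/'))     # True
print(is_excluded('/blog/my_app/'))     # False: the pattern is anchored at the start
```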

Keeping comments

By default, the middleware removes all HTML comments. If you want to keep the comments, set KEEP_COMMENTS_ON_MINIFYING to True:

KEEP_COMMENTS_ON_MINIFYING = True

Conservative whitespace minifying

By default the minifier will try to intelligently remove whitespace and leave spaces only as needed for inline text rendering. Sometimes it may be necessary to not completely remove whitespace but only reduce each run of whitespace to a single space. If you set CONSERVATIVE_WHITESPACE_ON_MINIFYING to True, whitespace is always reduced to a single space and never completely removed:

CONSERVATIVE_WHITESPACE_ON_MINIFYING = True
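The difference between the two behaviours can be illustrated with a deliberately naive, regex-based sketch. This is not the algorithm django-htmlmin uses; collapse_conservative and collapse_aggressive are hypothetical functions that only show why removing whitespace entirely can change how inline text renders, while collapsing it to one space cannot.

```python
import re


def collapse_conservative(html):
    """Reduce every run of whitespace to one space; never remove it entirely."""
    return re.sub(r'\s+', ' ', html)


def collapse_aggressive(html):
    """Naively drop all whitespace that touches a tag (cruder than a real minifier)."""
    html = re.sub(r'\s+', ' ', html)
    return re.sub(r'\s*(<[^>]+>)\s*', r'\1', html)


sample = '<p>Hello   <b>big</b>\n   world</p>'

print(collapse_conservative(sample))  # <p>Hello <b>big</b> world</p>
print(collapse_aggressive(sample))    # <p>Hello<b>big</b>world</p>
```

The second result would render as "Hellobigworld", which is the kind of output the conservative mode is meant to avoid.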

Using the decorator

django-htmlmin also provides a decorator that you can apply to just the views whose responses you want to minify:

from django.shortcuts import render
from htmlmin.decorators import minified_response

@minified_response
def home(request):
    return render(request, 'home.html')
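For readers curious about the general pattern, the following is a rough sketch of how a per-view minifying decorator can be structured. It is not django-htmlmin's implementation: FakeResponse, naive_minify and minified are hypothetical names, and the whitespace collapsing stands in for a real call to html_minify.

```python
import re
from functools import wraps


class FakeResponse:
    """Stand-in for an HTTP response object (not Django's HttpResponse)."""

    def __init__(self, content, content_type='text/html'):
        self.content = content
        self.content_type = content_type


def naive_minify(html):
    """Placeholder minifier: collapse whitespace runs to a single space."""
    return re.sub(r'\s+', ' ', html).strip()


def minified(view):
    """Wrap a view so that its HTML response body is minified before returning."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        response = view(*args, **kwargs)
        # Only touch HTML bodies; leave JSON, images, etc. alone.
        if response.content_type.startswith('text/html'):
            response.content = naive_minify(response.content)
        return response
    return wrapper


@minified
def home(request=None):
    return FakeResponse('<html>   <body>Hello   world</body>   </html>')


print(home().content)  # <html> <body>Hello world</body> </html>
```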

Decorator to prevent a response from being minified

You can use the not_minified_response decorator on views if you want to avoid the minification of any specific response, without using the EXCLUDE_FROM_MINIFYING setting:

from django.shortcuts import render
from htmlmin.decorators import not_minified_response

@not_minified_response
def home(request):
    return render(request, 'home.html')

Using the html_minify function

If you are not working with Django, you can invoke the html_minify function manually:

from htmlmin.minify import html_minify
html = '<html>    <body>Hello world</body>    </html>'
minified_html = html_minify(html)

Here is an example with a Flask view:

from flask import Flask, render_template
from htmlmin.minify import html_minify

app = Flask(__name__)

@app.route('/')
def home():
    rendered_html = render_template('home.html')
    return html_minify(rendered_html)

Keeping comments

By default, html_minify() removes all comments. If you want to keep them, you can pass ignore_comments=False:

from htmlmin.minify import html_minify
html = '<html>  <body>Hello world<!-- comment to keep --></body>  </html>'
minified_html = html_minify(html, ignore_comments=False)

Using the command line tool

If you are not even using Python, you can use the pyminify command line tool to minify HTML files:

$ pyminify index.html > index_minified.html

You can also keep the comments, if you want:

$ pyminify --keep-comments index.html > index_minified_with_comments.html
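For deployment scripts that are written in Python anyway, the same job can be done over a whole directory. The sketch below is one possible approach, not something shipped with django-htmlmin: minify_tree is a hypothetical helper that takes the minifying function as an argument, so it works with html_minify or any other callable that maps a string to a string.

```python
from pathlib import Path


def minify_tree(src_dir, dst_dir, minify):
    """Minify every .html file under src_dir into dst_dir, keeping the layout.

    `minify` is any callable taking and returning a str (for example
    htmlmin.minify.html_minify). Returns the list of files written.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    written = []
    # sorted() materialises the file list before anything is written.
    for source in sorted(src_dir.rglob('*.html')):
        target = dst_dir / source.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        html = source.read_text(encoding='utf-8')
        target.write_text(minify(html), encoding='utf-8')
        written.append(target)
    return written
```

With django-htmlmin installed, a call such as minify_tree('site', 'build', html_minify) would write minified copies of every page under site/ into build/ (both directory names are placeholders).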

Development

Pull requests are very welcome! Make sure your patches are well tested.

Running tests

If you are using a virtualenv, all you need to do is:

$ make test

Community

IRC channel

#cobrateam channel on irc.freenode.net

Changelog

You can see the complete changelog on the GitHub releases page.

LICENSE

Unless otherwise noted, the django-htmlmin source files are distributed under the BSD-style license found in the LICENSE file.

Comments
  • How to cache minified pages?

    Hi, I'd love to use HtmlMinifyMiddleware, but I can't figure out where to put it in my settings.MIDDLEWARE so that it does its job, but is then also cached, so that we don't have to minify again until the page changes or the cache expires. As it is now, when I disable HtmlMinifyMiddleware, my server can handle >60X more requests per second, which suggests that, when it is enabled, HtmlMinifyMiddleware is doing its job for each request. I'd like it to do its job only when the page is rewritten to memcache via UpdateCacheMiddleware/FetchFromCacheMiddleware.

    My settings look a bit like this:

    MIDDLEWARE = (
        'django.middleware.cache.UpdateCacheMiddleware',
        'htmlmin.middleware.HtmlMinifyMiddleware',
        ...  # Other middleware
        'django.middleware.cache.FetchFromCacheMiddleware',
        'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
    )
    

    My reasoning (which could easily be misguided), is that minification should be the last thing to happen to a response body before it gets stored in the cache, hence, it should be the last thing before UpdateCacheMiddleware on the way "out" as a response, therefore the first thing after UpdateCacheMiddleware in the MIDDLEWARE tuple.

    I'm using Python 2.7 and Django 1.4, if that is relevant.

    What am I doing wrong? Also, whatever the answer, it would make a good addition to the installation guide.

    opened by mkoistinen 20
  • &lt; and &gt; get converted to < and > respectively

    I have this snippet of an html page::

    <div class="highlight"><pre><span class="cp">&lt;!DOCTYPE html&gt;</span>
    <span class="nt">&lt;html</span> <span class="na">lang=</span><span class="s">&quot;en&quot;</span> <span class="na">dir=</span><span class="s">&quot;ltr&quot;</span><span class="nt">&gt;</span>
        <span class="nt">&lt;head&gt;</span>
            <span class="nt">&lt;title&gt;</span>My Site!<span class="nt">&lt;/title&gt;</span>
        <span class="nt">&lt;/head&gt;</span>
        <span class="nt">&lt;body&gt;</span>
            <span class="nt">&lt;div</span> <span class="na">id=</span><span class="s">&quot;pagebody&quot;</span><span class="nt">&gt;</span>
                %include
            <span class="nt">&lt;/div&gt;</span>
        <span class="nt">&lt;/body&gt;</span>
    <span class="nt">&lt;/html&gt;</span>
    </pre></div>
    

    When I try to run this through html_minify() I wind up with the &lt; and &gt; pieces of text being treated as HTML tags. I think the problem with that is rather obvious, but if I didn't make sense, let me know and I'll try to explain better.

    opened by MTecknology 8
  • UnicodeDecodeError

    I think I've found another bug. I have a TextField in a model, and in the admin site, when I input a latin character, for example: á, after I submit the change_form and try to edit that object I get this error.

    Here's the traceback:

    Django Version: 1.3
    Python Version: 2.6.1

    Traceback:
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/lib/python2.6/site-packages/django/core/handlers/base.py" in get_response
      response = middleware_method(request, response)
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/src/django-htmlmin/htmlmin/middleware.py" in process_response
      response.content = html_minify(response.content)
    File "/Users/jonito/mingus/proyectos/galerias-belgrano/bootstrap/src/django-htmlmin/htmlmin/minify.py" in html_minify
      html_code = html_code.replace(script, TAGS_PATTERN % (tag, index))

    Exception Type: UnicodeDecodeError at /admin/chunks/chunk/2/
    Exception Value: ('ascii', '', 94, 95, 'ordinal not in range(128)')

    Bug 
    opened by honi 8
  • Support the new Django 1.10 MIDDLEWARE style

    I get a TypeError when using the django-htmlmin middlewares in MIDDLEWARE in Django 1.10 (they're fine when using the old-style MIDDLEWARE_CLASSES).

    See https://docs.djangoproject.com/en/1.10/topics/http/middleware/#upgrading-pre-django-1-10-style-middleware for upgrading information.

    Bug 
    opened by tremby 7
  • ignore option

    hey, thanks for the module. i just tried your app, and it looks awesome as it is already.

    i'm using the grappelli admin interface and it looks kinda weird with your middleware. the selectors lose their default height and become one-liners. it's very awkward when you have a list with lots of elements.

    so i would kindly feature request for some decorator or optional ignore settings. in my case i would like to ignore the whole admin interface.

    anyway, thanks for your job.

    Bug 
    opened by fuxter 7
  • error with gzip on nginx

    If I try to add gzip on in nginx.conf, PageSpeed Insights returns:

    The server closed the connection before sending a full response. Ensure that the page loads in a browser and try again

    Bug 
    opened by vladimirmyshkovski 6
  • Too aggressive?

    This seems sort of wrong to me (using the latest release):

    >>> str = "<b>hey </b>you"
    >>> html_minify(str)
    u'<html><head></head><body><b>hey</b>you</body></html>'
    >>> 
    

    which would render like "heyyou", when the intent would pretty clearly be "hey you"

    Bug 
    opened by rosskarchner 6
  • change html_minify to recursive function

    As per discussions in #21, I implemented a minify version using recursive functions to walk and clean the tree. This method is, I think, more robust and happens to be faster.

    opened by hrbonz 6
  • Spaces removed around <a> tags in text

    I have a text of the form:

        Lorem ipsum dolor sit amet....
        <a href="#">Ut enim</a> ad minim veniam
    

    When minified the spaces around the link tags are removed, causing the linked words to cram into the surrounding text. This is with django-htmlmin 0.5.2.

    I had to use non-breaking spaces around the links to work around this issue.

    Bug 
    opened by atodorov 6
  • Add CONSERVATIVE_WHITESPACE_ON_MINIFYING to retain inline text spaces

    I was looking into the issue outlined in #121 and #21 as it's affecting me. Essentially the current system of completely removing whitespace in a text flow has some bugs and it's not easy to fix. So, in an effort to at least get things working I've added a CONSERVATIVE_WHITESPACE_ON_MINIFYING option to turn off eagerly removing whitespace and always leave at least 1 space (I'm also hoping that feature may be useful to someone else for other purposes). The default is to continue working as before for backwards compatibility.

    The slightly longer explanation

    If you try minifying a <i>b</i> it will properly reduce to a <i>b</i> and the display output in a browser will be "a b" (note the space). However, if you try minifying a<i> b</i> it will reduce to a<i>b</i> and then get displayed as "ab" (note, no space). Ideally, the 2nd case should end up being a <i>b</i> to retain that inline space but at the right spot.

    Now, I tried going through the code to see if I could fix that 2nd case, but I'm not sure I see a way to get there without some major restructuring. One issue is that when examining the NavigableString of ' b' in is_inflow, the previous_sibling is None (in the above example). BS4 doesn't treat a tag next to a text block as siblings.

    I did find another project that seemed to do a good job of this here: https://github.com/kangax/html-minifier/blob/51ce10f4daedb1de483ffbcccecc41be1c873da2/src/htmlminifier.js#L65-L83 However, that's in JS using completely different libraries and would require quite a bit of time to rework with BS4. That project also had a "conservative whitespace" option and I thought it might be a good feature to add and also provide a workaround for this issue until a better solution is developed.

    One last point... Any solution essentially assumes that inline/inline-block HTML elements will remain such, but CSS could easily break things if someone were to do something like <div style="display:inline">. Obviously that's a silly thing to do, but this change allows for things like that to work as expected. Sometimes HTML on the page comes from third party libraries and it's not always possible to fix these things to have proper HTML.

    opened by tisdall 5
  • Added support for Python 3 and 2

    I've refactored the unicode literals so that it supports both Python 2 and Python 3. I used 'six' just to keep the codebase a little cleaner, I hope that's okay.

    However, there are some known issues, which I hope you guys know how to fix:

    Even though it passes the tests on Python 2.x, it fails to even run the tests on Python 3. This is because nosedjango does not support Python 3, which causes Travis to stop the builds on Python 3.x. Though our code should (theoretically) work on Python 3 now.

    Any idea how we can fix that? The nosedjango project seems pretty dead.

    opened by drcd 5
  • Bump actions/setup-python from 4.3.0 to 4.4.0

    Bumps actions/setup-python from 4.3.0 to 4.4.0.

    Release notes

    Sourced from actions/setup-python's releases.

    Add support to install multiple python versions

    In scope of this release we added support to install multiple python versions. For this you can try to use this snippet:

        - uses: actions/setup-python@v4
          with:
            python-version: |
                3.8
                3.9
                3.10
    

    Besides, we changed logic with throwing the error for GHES if cache is unavailable to warn (actions/setup-python#566).

    Improve error handling and messages

    In scope of this release we added improved error message to put operating system and its version in the logs (actions/setup-python#559). Besides, the release

    dependencies 
    opened by dependabot[bot] 0
  • Bump actions/checkout from 3.1.0 to 3.2.0

    Bumps actions/checkout from 3.1.0 to 3.2.0.

    Release notes

    Sourced from actions/checkout's releases.

    v3.2.0

    Full Changelog: https://github.com/actions/checkout/compare/v3...v3.2.0

    dependencies 
    opened by dependabot[bot] 0
  • No wheel available for 0.11.0 on PyPI

    Hello!

    Only the source tarball is available on PyPI for version 0.11.0 (https://pypi.org/project/django-htmlmin/0.11.0/#files). Would it be possible to get a wheel as well?

    opened by Steap 0
  • need to retain space around inline, inline-block, and comment

    Currently django-htmlmin retains space around inline text elements. However, there are other inline/inline-block elements that still render whitespace around them that are currently not handled. Also, comment removal takes spaces outside the comment with it.

    examples:

    >>> from htmlmin.minify import html_minify
    >>> html_minify('''<img src="http://example.com/image.jpg" alt="example">  hi''')
    '<html><head></head><body><img alt="example" src="http://example.com/image.jpg"/>hi</body></html>'
    >>> html_minify('''name: <input type="text">''')
    '<html><head></head><body>name:<input type="text"/></body></html>'
    >>> html_minify('''<p>a <!-- --> b</p>''')
    '<html><head></head><body><p>ab</p></body></html>'
    >>> html_minify('''<em>a <!-- --> b</em>''')  # this works as `em` is a text context
    '<html><head></head><body><em>a  b</em></body></html>'
    >>> html_minify('''<button> click me  </button>  text after''')
    '<html><head></head><body><button>click me</button>text after</body></html>'
    

    Some of these can be fixed by adding attributes to TEXT_FLOW, but they aren't really text elements so naming would probably need adjusting.

    opened by tisdall 0
  • problem using html_minify in django

    Hi, thank you for doing this project!

    I am trying to manually minify a view in a django project, but I don't have it integrated into the middleware (other views are already taken care of and have a build process). I expected to be able to call html_minify on the output of a call to django.shortcuts.render. Instead I get:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 34, in inner
        response = get_response(request)
      File "/usr/local/lib/python3.8/site-packages/debug_toolbar/middleware.py", line 67, in __call__
        panel.generate_stats(request, response)
      File "/usr/local/lib/python3.8/site-packages/debug_toolbar/panels/headers.py", line 53, in generate_stats
        self.response_headers = OrderedDict(sorted(response.items()))
    
    Exception Type: AttributeError at /v/27/
    Exception Value: 'str' object has no attribute 'items'
    

    Maybe there is something in newer django? Or is there some intermediate step in this possibly unusual workflow?

    opened by chrisamow 0
Releases(0.11.0)