xsendfile etc wrapper

Overview

Django Sendfile

This is a wrapper around web-server specific methods for sending files to web clients. It is useful when Django needs to check the permissions associated with files but should not serve the actual bytes of the file itself, as serving large files is not what Django is made for.

Note that this should not be used for regular file serving (e.g. CSS), only for cases where you need Django to do some work before serving the actual file.

The interface is a single function, sendfile(request, filename, attachment=False, attachment_filename=None), which returns an HttpResponse object.

from sendfile import sendfile

# send myfile.pdf to user
return sendfile(request, '/home/john/myfile.pdf')

# send myfile.pdf as an attachment (with name myfile.pdf)
return sendfile(request, '/home/john/myfile.pdf', attachment=True)

# send myfile.pdf as an attachment with a different name
return sendfile(request, '/home/john/myfile.pdf', attachment=True, attachment_filename='full-name.pdf')
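The attachment and attachment_filename arguments control the Content-Disposition header of the response. As a rough illustration of the header values involved, here is a sketch. The content_disposition helper is hypothetical and is not the library's code, and the filename* form used for non-ASCII names is an assumption about how such names could be encoded.

```python
import os
from urllib.parse import quote


def content_disposition(path, attachment_filename=None):
    """Build an illustrative Content-Disposition value for a download.

    Hypothetical helper: it mirrors the documented default (the file's
    basename) but is not django-sendfile's implementation.
    """
    name = attachment_filename or os.path.basename(path)
    # ASCII-only fallback: drop characters that cannot be encoded as ASCII.
    ascii_name = name.encode("ascii", "ignore").decode("ascii")
    value = 'attachment; filename="%s"' % ascii_name
    if ascii_name != name:
        # Percent-encoded UTF-8 form for clients that understand filename*.
        value += "; filename*=UTF-8''%s" % quote(name)
    return value


print(content_disposition("/home/john/myfile.pdf"))
print(content_disposition("/home/john/myfile.pdf", "full-name.pdf"))
```

The two calls correspond to the second and third sendfile examples above: the first falls back to the basename, the second uses the explicit name.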

Backends are specified using the setting SENDFILE_BACKEND. Currently available backends are:

  • sendfile.backends.development - for use with the Django development server only. DO NOT USE IN PRODUCTION
  • sendfile.backends.simple - "simple" backend that uses Django file objects to attempt to stream files from disk (note middleware may cause files to be loaded fully into memory)
  • sendfile.backends.xsendfile - sets X-Sendfile header (as used by mod_xsendfile/Apache and lighttpd)
  • sendfile.backends.mod_wsgi - sets Location with 200 code to trigger internal redirect (daemon mode mod_wsgi only - see below)
  • sendfile.backends.nginx - sets X-Accel-Redirect header to trigger internal redirect to file
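To make the difference between the three header-based backends concrete, the sketch below lists what each one hands back to the web server. It is an illustration only: offload_headers is a hypothetical helper, and a plain dict stands in for Django's HttpResponse.

```python
def offload_headers(backend, file_path=None, internal_url=None):
    """Return (status, headers) roughly as each kind of backend would.

    Hypothetical helper; a plain dict stands in for an HttpResponse.
    """
    if backend == "xsendfile":
        # The web server reads the file named in the header.
        return 200, {"X-Sendfile": file_path}
    if backend == "nginx":
        # nginx performs an internal redirect to this URL.
        return 200, {"X-Accel-Redirect": internal_url}
    if backend == "mod_wsgi":
        # Location with a 200 status triggers an internal redirect.
        return 200, {"Location": internal_url}
    raise ValueError("unknown backend: %r" % backend)


print(offload_headers("xsendfile", file_path="/home/john/myfile.pdf"))
print(offload_headers("nginx", internal_url="/protected/myfile.pdf"))
```

Note that the xsendfile style works with a filesystem path, while the nginx and mod_wsgi styles need a URL, which is why those two backends require extra settings.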

If you want to write your own backend simply create a module with a sendfile function matching:

def sendfile(request, filename, **kwargs):
    '''Return HttpResponse object for serving file'''

Then specify the full path to the module in SENDFILE_BACKEND. You only need to implement the sending of the file. Adding the content-disposition headers etc is done elsewhere.
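A dotted module path like this is typically resolved with importlib. The sketch below shows one way that lookup could work. load_backend and the my_custom_backend module name are made up for illustration and are not taken from the library.

```python
import sys
import types
from importlib import import_module


def load_backend(dotted_path):
    """Import the module named by dotted_path and return its sendfile function."""
    module = import_module(dotted_path)
    return module.sendfile


# Stand-in for a real backend module (the name is made up); it is registered
# in sys.modules so that the example is self-contained.
fake = types.ModuleType("my_custom_backend")
fake.sendfile = lambda request, filename, **kwargs: ("response for", filename)
sys.modules["my_custom_backend"] = fake

backend = load_backend("my_custom_backend")
print(backend(None, "/home/john/myfile.pdf"))
```

In a real project the module would live on the import path and return an HttpResponse rather than a tuple.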

Development backend

The development backend is only meant for use while writing code. It uses Django's static file serving code to do the job, which is only meant for development. It reads the whole file into memory and then sends it down the wire - not good for big files, but OK when you are just testing things out.

It will work with the Django dev server and anywhere else you can run Django.

Simple backend

This backend is one step up from the development backend. It uses Django's django.core.files.base.File class to try to stream files from disk. However, some middleware that rewrites content (e.g. GzipMiddleware) will cause the entire file to be loaded into memory. So only use this backend if you are not using middleware that rewrites content or you only have very small files.
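The difference between streaming and loading a whole file can be sketched with a plain generator. This is not the backend's code. iter_file_chunks is a hypothetical helper, and the 8192-byte chunk size is an arbitrary choice.

```python
import os
import tempfile


def iter_file_chunks(path, chunk_size=8192):
    """Yield the file's bytes in pieces, one chunk in memory at a time."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demonstration with a temporary 20,000-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 20000)

chunks = list(iter_file_chunks(tmp.name))
print([len(c) for c in chunks])
os.remove(tmp.name)
```

Anything that needs the complete body before it can respond has to join all of the chunks first, which is the situation described above for content-rewriting middleware.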

xsendfile backend

Install mod_xsendfile in Apache or use lighttpd. You may need to configure mod_xsendfile, but that should be as simple as:

XSendFile On

Place this in your virtual host or conf file.

mod_wsgi backend

The mod_wsgi backend will only work when using mod_wsgi in daemon mode, not in embedded mode. It requires a bit more work to get it to do the same job as xsendfile though. However, some may find it easier to set up, as they don't need to compile and install mod_xsendfile.

Firstly, there are two more Django settings:

  • SENDFILE_ROOT - this is a directory where all files that will be used with sendfile must be located
  • SENDFILE_URL - internal URL prefix for all files served via sendfile

These settings are needed as this backend makes mod_wsgi send an internal redirect, so we have to convert a file path into a URL. This means that the files are visible via Apache by default too. So we need to get Apache to hide those files from anything that's not an internal redirect. To do this we can use some mod_rewrite magic along these lines:

RewriteEngine On
# see if we're on an internal redirect or not
RewriteCond %{THE_REQUEST} ^[\S]+\ /private/
RewriteRule ^/private/ - [F]

Alias /private/ /home/john/Development/myapp/private/
<Directory /home/john/Development/myapp/private/>
    Order deny,allow
    Allow from all
</Directory>

In this case I have also set:

SENDFILE_ROOT = '/home/john/Development/myapp/private/'
SENDFILE_URL = '/private'

All files are stored in a folder called 'private'. We forbid access to this folder (RewriteRule ^/private/ - [F]) if someone tries to access it directly (RewriteCond %{THE_REQUEST} ^[\S]+\ /private/) by checking the original request (THE_REQUEST).

Allegedly IS_SUBREQ can be used to perform the same job, but I was unable to get this working.
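The conversion from a file path to an internal URL can be sketched as follows. This is an illustration under assumptions rather than the library's code: file_path_to_url is a hypothetical helper, it assumes POSIX-style paths, and percent-encoding of special characters is deliberately left out.

```python
import os
import posixpath


def file_path_to_url(file_path, sendfile_root, sendfile_url):
    """Map a file under sendfile_root to a URL under sendfile_url."""
    root = os.path.abspath(sendfile_root)
    path = os.path.abspath(file_path)
    relative = os.path.relpath(path, root)
    # Refuse anything that resolves to a location outside the root directory.
    if relative == os.pardir or relative.startswith(os.pardir + os.sep):
        raise ValueError("%s is not inside %s" % (file_path, sendfile_root))
    # URLs always use forward slashes, whatever the local path separator is.
    return posixpath.join(sendfile_url, *relative.split(os.sep))


print(file_path_to_url(
    "/home/john/Development/myapp/private/docs/report.pdf",
    "/home/john/Development/myapp/private/",
    "/private",
))
```

Rejecting paths outside the root matters here, because every file under the root becomes reachable through the internal URL prefix.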

Nginx backend

As with the mod_wsgi backend you need to set two extra settings:

  • SENDFILE_ROOT - this is a directory where all files that will be used with sendfile must be located
  • SENDFILE_URL - internal URL prefix for all files served via sendfile

You then need to configure nginx to only allow internal access to the files you wish to serve. More details on this are in the nginx documentation on X-Accel-Redirect and the internal directive.

For example, if I use the Django settings:

SENDFILE_ROOT = '/home/john/Development/django-sendfile/examples/protected_downloads/protected'
SENDFILE_URL = '/protected'

Then the matching location block in nginx.conf would be:

location /protected/ {
  internal;
  root   /home/john/Development/django-sendfile/examples/protected_downloads;
}

You need to pay attention to whether you have trailing slashes or not on the SENDFILE_URL and root values, otherwise the wrong URL may be sent to nginx and you may get 404s. If this happens, you should be able to see which file nginx is trying to load in the error.log. From there it should be fairly easy to work out what the right settings are.
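One way to sanity-check the settings is to reproduce the lookup by hand: with the root directive, nginx appends the request URI to the root value. The sketch below uses the example values from this section. nginx_root_lookup is a hypothetical helper, report.pdf is a made-up file name, and the trailing-slash handling is a simplification.

```python
def nginx_root_lookup(root, uri):
    """Path read for a location that uses `root`: the URI appended to root."""
    return root.rstrip("/") + uri


SENDFILE_ROOT = "/home/john/Development/django-sendfile/examples/protected_downloads/protected"
SENDFILE_URL = "/protected"
NGINX_ROOT = "/home/john/Development/django-sendfile/examples/protected_downloads"

# A file under SENDFILE_ROOT and the internal URL that would be sent for it.
file_on_disk = SENDFILE_ROOT + "/report.pdf"
internal_url = SENDFILE_URL + "/report.pdf"

# True when nginx would end up at the same file Django meant to serve.
print(nginx_root_lookup(NGINX_ROOT, internal_url) == file_on_disk)
```

If root were set to the same value as SENDFILE_ROOT, the lookup would contain 'protected' twice and nginx would return a 404.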

Comments
  • Requested URL

    Hello,

    I want to apologize first if this is a simple problem; I am learning Django and was trying your app. I was able to work out all of it except the last part.

    I can load image or file to Django Admin and the url of the image is: download/6.jpg

    When I click it, I get an error:

    http://cl.ly/image/0P420L0f3l1M

    Even if I click in the download list when files are public, I get the same error. I probably need to add that URL "protected/download" to urls.py, but then I would need to create a view, no?

    Can you tell me the proper way to setup your example?

    Thanks

    opened by python-force 17
  • Only provide attachment filename explicitly, to allow unicode filenames.

    The Content-Disposition header can be used without passing in a filename parameter: in that case, the filename will be the last portion of the url path.

    Leaving off the filename parameter, and pulling the value from the url string automatically allows for unicode strings to be used as filenames: otherwise django specifically prohibits non-ascii strings, which used to be part of the specification (and is the only way that is guaranteed to work in all browsers).

    This patch will only set the parameter if it is explicitly passed in to the sendfile function. It may break existing code that implicitly relies on the filename. Such code can be fixed by, instead of calling:

     sendfile(request, file, True)
    

    using:

     sendfile(request, file, True, os.path.basename(file))
    
    opened by schinckel 13
  • Added support for non-ascii filenames in content-disposition.

    Using the details outlined at: http://kbyanc.blogspot.hk/2010/07/serving-file-downloads-with-non-ascii.html

    To support older browsers, an ASCII-only filename is added to the Content-Disposition header. The Unicode version is properly encoded and added to the header. The argument attachment_filename may now be False to send no filename. Added and fixed unit tests to verify the new behavior.

    This change introduces a new dependency on Unidecode https://pypi.python.org/pypi/Unidecode. I have tested this change with Python 2.7.5 and Django 1.6. All unit tests and manual test cases worked for me. Please review the changes. If any other configurations do not work please let me know with version numbers.

    opened by jdufresne 12
  • Added url quoting to X-Accel-Redirect and Location header

    This fix is as unintrusive as I could make it. All the tests pass on Python2. As you know many tests fail on Python3.

    I successfully tested this with a Nginx setup on PY2 and PY3. I couldn't test this on Apache2 as I'm really unfamiliar with the setup.

    Is there anything else you need?

    opened by Proper-Job 9
  • mod_wsgi backend broken on Django 1.8

    On Django 1.7 and before, the Location header that is returned by the mod_wsgi backend is just the path to the file (eg: /media/documents/mydoc.pdf).

    On Django 1.8, it has http:// prepended to it. So the above path has now changed to http:///media/documents/mydoc.pdf.

    opened by kaedroho 9
  • UTF-8 compatibility fix broke serving filenames with spaces

    Hi all,

    Updating from 0.3.4 to 0.3.6, I had the bad surprise to see that it broke the support of files containing whitespaces.

    The response of sendfile 0.3.4 for such a file was:

     X-Accel-Redirect: /protected_files/1/files/Backup 2014-09-02-2.sqlbackup
     Content-length: 70458
     Content-Type: application/octet-stream
     Content-Disposition: attachment; filename="Backup 2014-09-02-2.sqlbackup.zip"

    And now (since 0.3.5) it is:

     X-Accel-Redirect: /protected_files/1/files/Backup%202014-09-02-2.sqlbackup
     Content-length: 70458
     Content-Type: application/octet-stream
     Content-Disposition: attachment; filename="Backup 2014-09-02-2.sqlbackup.zip"

    Which makes the download fail...

    Best, Matthieu

    opened by MRigal 9
  • Unicode error

    Hi, I followed your documentation's suggestion to use django sendfile.

    I'm French and, as with the English word café, we have words with accents.

    It's rather common for us to have this kind of letter in file names, but here is a stack trace:

    [my app things]
      File "/home/gunicorn/prod/ama/ama_prod/ama_app/models/site.py", line 186, in get_document
        return sendfile(request, self.file.path)
    
      File "/home/gunicorn/prod/ama/venv/lib/python2.6/site-packages/sendfile/__init__.py", line 59, in sendfile
        response = _sendfile(request, filename, mimetype=mimetype)
    
      File "/home/gunicorn/prod/ama/venv/lib/python2.6/site-packages/sendfile/backends/nginx.py", line 7, in sendfile
        response['X-Accel-Redirect'] = _convert_file_to_url(filename)
    
      File "/home/gunicorn/prod/ama/venv/lib/python2.6/site-packages/django/http/__init__.py", line 612, in __setitem__
        header, value = self._convert_to_ascii(header, value)
    
      File "/home/gunicorn/prod/ama/venv/lib/python2.6/site-packages/django/http/__init__.py", line 601, in _convert_to_ascii
        value = value.encode('us-ascii')
    
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 40: ordinal not in range(128), HTTP response headers must be in US-ASCII format
    

    note: my file name is péter_là_gueule.txt

    Using django 1.5 and gunicorn with UTF-8 encoded filesystem.

    opened by Christophe31 7
  • Remove calculation of Content-Length.

    mod_xsendfile unconditionally removes the Content-Length header as it will always be recalculated when serving the passed file. Calculating it in Django is a waste.

    See where mod_xsendfile removes this header:

    https://github.com/nmaier/mod_xsendfile/blob/1480b2dd27ebe08cc47e84f4666f6df2b28a79b5/mod_xsendfile.c#L396-L397

    opened by jdufresne 5
  • sendfile() accepts file wrapper instead of filename

    As of version 0.3.2, the sendfile() function takes a filename as input and checks that this file exists using os.path.exists (see https://github.com/johnsensible/django-sendfile/blob/298f632cb804c864f74dd248773286f889a03e5d/sendfile/__init__.py#L48 ). This implementation limits sendfile() usage to files that live on the local filesystem.

    What about accepting file wrappers instead of filenames? such as FieldFile (file wrapper for FileField). This would make it possible to serve files from various locations: local filesystem, storages, URL, in-memory files...

    opened by benoitbryon 5
  • No mention on nginx

    Not sure whether this is just a documentation issue (i.e. it's supported by one of the other backends i.e. WSGI) or whether it's unsupported.

    Sorry if it's obvious to others but this area isn't one of my strong points - hence why I'm looking for a nice abstraction library!

    opened by andybak 5
  • Licensing

    sendfile depends on Unidecode, which is GPL. This means that sendfile and all other software that imports it should be GPL too. Since Unidecode is used only here:

    https://github.com/johnsensible/django-sendfile/blob/master/sendfile/__init__.py#L80

    would it be possible to remove this dependency?

    opened by drakkan 4
  • Fix `ascii_filename` to work properly in Python 3

    This fixes issue #49. In Python 3 the result of `str.encode()` is `bytes`, and so `"something %s" % (b'else') == "something b'else'"`, which `2to3` does not catch because of the `str`-`bytes` ambiguity in Python 2 (see also here). I suggest simply reconverting the ASCII bytes into a Unicode string (containing only ASCII characters) for Python 3. Actually, the version check is redundant, as in Python 2 I think

    s = s.encode('ascii', 'ignore')
    assert s.decode('ascii') == s
    

    is always true.

    I haven't tested this code.

    opened by kwikwag 0
  • Optimize using the library with Nginx as a proxy to AWS S3

    Hello,

    Many years ago I made some changes for the sole purpose of integrating this module into both personal and professional projects that are running on Amazon Web Services and using S3 as a backend for the media assets.

    This module helped me to build a secure yet scalable and efficient asset storage by proxying S3 with Nginx. The applications offload data transfers to Nginx thanks to the X-Sendfile header.

    In such a setup, trivial operations such as computing the MIME type or file size require unnecessary round trips (and data transfer) from S3 to the servers. I investigated the issue and found a neat way to optimize such a setup.

    Basically, I had two ways to optimize the storage:

    1. Precompute values (e.g. mimetype and size) and store into the application's DB.
    2. Let the backend (Nginx + S3) do their job (return HTTP 404 if file is missing, add Content-Type header, etc).

    There are currently two projects running in production (Django and Flask) with this module (my version of it).

    I would be glad to contribute to this repository and offer the opportunity to other users to use these features ...

    Thank you for reviewing the changes.

    Best Regards,

    David Fischer

    opened by davidfischer-ch 0
  • Django 2.1 and above has removed the permalink() decorator

    Django 2.1 and above has removed the permalink() decorator.

    examples/protected_downloads/download/models.py contains @permalink

    Just a note, in case you want to update your example application for modern versions of Django.

    opened by dancaron 1