Auto-detecting the n+1 queries problem in Python

Overview

nplusone

Latest version Travis-CI Code coverage

nplusone is a library for detecting the n+1 queries problem in Python ORMs, including SQLAlchemy, Peewee, and the Django ORM.

The Problem

Many object-relational mapping (ORM) libraries default to lazy loading for relationships. This pattern can be efficient when related rows are rarely accessed, but quickly becomes inefficient as relationships are accessed more frequently. In these cases, loading related rows eagerly using a JOIN can be vastly more performant. Unfortunately, understanding when to use lazy versus eager loading can be challenging: you might not notice the problem until your app has slowed to a crawl.

nplusone is an ORM profiling tool to help diagnose and improve poor performance caused by inappropriate lazy loading. nplusone monitors applications using Django or SQLAlchemy and sends notifications when potentially expensive lazy loads are emitted. It can identify the offending relationship attribute and specific lines of code behind the problem, and recommend fixes for better performance.

nplusone also detects inappropriate eager loading for Flask-SQLAlchemy and the Django ORM, emitting a warning when related data are eagerly loaded but never accessed within the current request.

Installation

pip install -U nplusone

nplusone supports Python >= 2.7 or >= 3.3.

Usage

Note: nplusone should only be used for development and should not be deployed to production environments.

Django

Note: nplusone supports Django >= 1.8.

Add nplusone to INSTALLED_APPS:

INSTALLED_APPS = (
    ...
    'nplusone.ext.django',
)

Add NPlusOneMiddleware:

MIDDLEWARE = (
    'nplusone.ext.django.NPlusOneMiddleware',
    ...
)

Optionally configure logging settings:

NPLUSONE_LOGGER = logging.getLogger('nplusone')
NPLUSONE_LOG_LEVEL = logging.WARN

Configure logging handlers:

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'nplusone': {
            'handlers': ['console'],
            'level': 'WARN',
        },
    },
}

When your app loads data lazily, nplusone will emit a log message:

Potential n+1 query detected on `<model>.<field>`

Consider using select_related or prefetch_related in this case.

When your app eagerly loads related data without accessing it, nplusone will log a warning:

Potential unnecessary eager load detected on `<model>.<field>`

Flask-SQLAlchemy

Wrap application with NPlusOne:

from flask import Flask
from nplusone.ext.flask_sqlalchemy import NPlusOne

app = Flask(__name__)
NPlusOne(app)

Optionally configure logging settings:

app = Flask(__name__)
app.config['NPLUSONE_LOGGER'] = logging.getLogger('app.nplusone')
app.config['NPLUSONE_LOG_LEVEL'] = logging.ERROR
NPlusOne(app)

When your app loads data lazily, nplusone will emit a log message:

Potential n+1 query detected on `<model>.<field>`

Consider using subqueryload or joinedload in this case; see SQLAlchemy's guide to relationship loading for complete documentation.

When your app eagerly loads related data without accessing it, nplusone will log a warning:

Potential unnecessary eager load detected on `<model>.<field>`

WSGI

For other frameworks that follow the WSGI specification, wrap your application with NPlusOneMiddleware. You must also import the relevant nplusone extension for your ORM:

import bottle
from nplusone.ext.wsgi import NPlusOneMiddleware
import nplusone.ext.sqlalchemy

app = NPlusOneMiddleware(bottle.app())

Generic

The integrations above are coupled to the request-response cycle. To use nplusone outside the context of an HTTP request, use the Profiler context manager: You must also import the relevant nplusone extension for your ORM:

from nplusone.core import profiler
import nplusone.ext.sqlalchemy

with profiler.Profiler():
    ...

Customizing notifications

By default, nplusone logs all potentially unnecessary queries using a logger named "nplusone". When the NPLUSONE_RAISE configuration option is set, nplusone will also raise an NPlusOneError. This can be used to force all automated tests involving unnecessary queries to fail.

# Django config
NPLUSONE_RAISE = True

# Flask config
app.config['NPLUSONE_RAISE'] = True

The exception type can also be specified, if desired, using the NPLUSONE_ERROR option.

Ignoring notifications

To ignore notifications from nplusone globally, configure the whitelist using the NPLUSONE_WHITELIST option:

# Django config
NPLUSONE_WHITELIST = [
    {'label': 'n_plus_one', 'model': 'myapp.MyModel'}
]

# Flask-SQLAlchemy config
app.config['NPLUSONE_WHITELIST'] = [
    {'label': 'unused_eager_load', 'model': 'MyModel', 'field': 'my_field'}
]

You can whitelist models by exact name or by fnmatch patterns:

# Django config
NPLUSONE_WHITELIST = [
    {'model': 'myapp.*'}
]

To suppress notifications locally, use the ignore context manager:

from nplusone.core import signals

with signals.ignore(signals.lazy_load):
    # lazy-load rows
    # ...

License

MIT licensed. See the bundled LICENSE file for more details.

Comments
  • [django] Do not generate warning for (un)needed `prefetch_related` on `get`

    [django] Do not generate warning for (un)needed `prefetch_related` on `get`

    Hello.

    I love the idea of this app. Wanted this/to create it for years. So, thank you.

    One thing that annoy me a little is this: in the test test_many_to_one_reverse, nplusone generates a warning when selecting addresses for a user.

    But I don't think it should, as doing a prefetch_related here won't change anything: we'll still have two queries, one to get the first user, and one to get the addresses (instantly if we use prefetch_related, and only when asking for addreses when not)

    Sadly this prevents me to really use this lib because it can happens a lot as there is no optimization to do here, so nobody does a prefetch_related for a get (or .first() too?). Think of edit forms, detail views...

    The only way to change this would be to monkey patch the get method, to not take the current queryset into action for missing prefetch_related (but still do for missing select_related).

    opened by twidi 7
  • Ignore queries retrieving no data

    Ignore queries retrieving no data

    If I use a relation (prefetched via prefetch_related in django for example) in a loop but the query returned no data, nplusone think that it's a "eager load'

    It should only think this way if actual data is returned, because this causes a lot of false positive :(

    opened by twidi 6
  • AttributeError: 'RelatedManager' object has no attribute 'prefetch_cache_name'

    AttributeError: 'RelatedManager' object has no attribute 'prefetch_cache_name'

    Getting the error when using django.db.models.Prefetch to prefetch related through model. Error does not occur when nplusone is not in INSTALLED_APPS.

    Test fails on all tox envs.

    opened by dmytrokyrychuk 5
  • Ignore by URL prefixes

    Ignore by URL prefixes

    I am using django with a lot of 3d party apps, like: django-filer, django-axes, reversion, and may others. I like to be notified about potential problems with exceptions.

    I would like to ignore a lot of errors inside my admin panel. But, since I can't and do not want to fix issues with 3d party plugins, I would like to ignore some urls:

    • /admin/filer
    • and some urls of mine

    Can you consider adding such feature?

    opened by sobolevn 4
  • Usage without HTTP requests

    Usage without HTTP requests

    Hi, thanks for the awesome library.

    Right now I try to use your library for a specific project, which includes:

    1. a lot of celery tasks
    2. telegram events handling via python-telegram-bot
    3. a lot of existing unit tests

    So, I want to raise exceptions inside these "functions" if there are any violations. But the problem is that I don't have any http requests.

    So, middleware does not work here. Do you have any vision about how this library should be used in these cases? Related to https://github.com/wemake-services/wemake-django-template/issues/255

    Thanks!

    opened by sobolevn 2
  • Improve Flask APIs

    Improve Flask APIs

    This fixes a few behaviors and improves the usability of the Flask extension.

    • *Support factory pattern (init_app)
    • Correct teardown to handle missing listeners
    • Add ignore helper

    Note: this isn't fully supporting the app.extensions paradigm, but it at least will function in single-app-factory applications.

    opened by dcramer 2
  • MIDDLEWARE_CLASSES is deprecated

    MIDDLEWARE_CLASSES is deprecated

    Hello,

    The MIDDLEWARE_CLASSES setting is deprecated in Django 1.10 and the MIDDLEWARE setting takes precedence. Projects using MIDDLEWARE cannot use the nplusone middleware

    opened by canassa 2
  • [django] Change the place of `NPlusOneMiddleware`

    [django] Change the place of `NPlusOneMiddleware`

    When the NPlusOneMiddleware is at the end of the list of middlewares and a previous middleware returned a response, an error occurs in NPlusOneMiddleware.process_response because the request is not the list of listeners, as the NPlusOneMiddleware.process_request was not called (because a middleware returning a response stops the execution of the process_request methods of the following middlewares)

    So the NPlusOneMiddleware should be before every middleware that can return a response (or, at least, manage KeyError exceptions in NPlusOneMiddleware.process_response)

    opened by twidi 2
  • Fix license in package metadata

    Fix license in package metadata

    PyPI expects an OSI-specified license--not the full license text.

    Just noticed this while glancing at OSF's pyup:

    image 2018-07-30 10-16-45

    P.S. No idea why pyup thinks nplusone isn't Py3-compatible since the correct classifiers are set.

    opened by sloria 1
  • Use MiddlewareMixin to fix a deprecation warning on Django 1.10

    Use MiddlewareMixin to fix a deprecation warning on Django 1.10

    This adds MiddlewareMixin which is needed to use the Django mixin in the new MIDDLEWARE setting instead of the deprecated MIDDLEWARE_CLASSES. Here's the relevant Django documentation: https://docs.djangoproject.com/en/1.10/topics/http/middleware/#upgrading-pre-django-1-10-style-middleware

    opened by noisecapella 1
  • Handle single and multiple loads of same record

    Handle single and multiple loads of same record

    If the same record is loaded both singly (e.g. using one or first) and multiply, we can get false positives for lazy loads. To fix, we can ignore lazy loads on records that were loaded singly. Note that this logic could lead to false negatives, which isn't ideal--but it's less confusing than the status quo. See https://github.com/flyerhzm/bullet/blob/master/lib/bullet/active_record5.rb#L53-L56 for an example.

    opened by jmcarp 0
  • Cascade deletes being flagged as N+1s

    Cascade deletes being flagged as N+1s

    I don't know if this is a bug or a feature.

    Using:

    • Django 4.1.2
    • Nplusone 1.0.0

    If I delete a model that generates cascade deletions, those are displayed in the console as N+1s. For instance, in the classic Book -> Author example, if I delete an Author, the package flags N+1 like this:

    Potential n+1 query detected on Book.author

    opened by lorinkoz 0
  • Add warning notifier

    Add warning notifier

    Using warnings about n+1 query is more pythonic.

    warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted

    https://docs.python.org/3/howto/logging.html#when-to-use-logging

    It is also convenient to use with pytest. Because pytest creates a report on all warnings. https://docs.pytest.org/en/7.1.x/how-to/capture-warnings.html

    Sorry for my english.

    opened by AlexeyBerezhnoy 0
  • maximum recursion depth exceeded in comparison

    maximum recursion depth exceeded in comparison

    There are some cases when in the function signalify_fetch_all the decorator is called on the already decorated function queryset._fetch_all, causing a recursion depth exception if a fetch all query is invoked many times. This also happens if the query is not actually executed but its cached result is used. The issue can be reproduced through the following Django test:

    class TestTheIssue(TestCase):
        def test_n_plus_one_wrapper_error(self):
            out = []
            # MyModelA has many MyModelB, the relationship is represented adding a 
            # models.ForeignKey field on model B with related name 'my_model_bs'
            qs = MyModelA.objects.all().prefetch_related('my_model_bs').first()
            # Execute this query N times, with N any number that is bigger than the recursion limit, 
            # in my case it is sys.getrecursionlimit() = 1000 
            for i in range(2000):  
                tmp = qs.my_model_bs.all()
                out.append(tmp.first())
    
            print(out[-1])
    

    Details

    nplusone version: 1.0.0 Django version: 3.2.13

    environment: python:3.8.9-slim-buster image for Docker

    opened by CosimoRusso 1
  • Eager load, when deleting in django admin.

    Eager load, when deleting in django admin.

    I have a quite frustrating problem. After fixing a lazyload-issue I now can't delete this model. Imagine a mixin along with its admin mixin. The ModelAdminMixin overrides get_queryset with a call to select_related; otherwise I'd get a N+1 warning when displaying the related object list or details. This however causes, that I can't delete any of these objects in the admin interface. I will get an eager loading error, wenn pressing delete or the history button of the object. How can I fix that? In the following you see a boiled down implementation of the mixins and how they are used.

    class ModelMixin(models.Model):
        created_by = ForeignKey(settings.AUTH_USER_MODEL, on_delete=PROTECT, related_name="created_%(app_label)s_%(class)s")
        class Meta:
            abstract = True
    
    class ModelAdminMixin(object):
            def get_queryset(self, request):
                    return super(ModelAdminMixin, self).get_queryset(request).select_related('created_by')
    

    They are then used in a concrete model as such:

    # models.py
    #------------
    
    class ConcreteModel(ModelMixin):
        name = models.CharField(_('name'), max_length=100, blank=False, default='undefined')
    
    # admin.py
    #-----------
    
    @register(Project)
    class ConcreteModelAdmin(ModelAdminMixin, admin.ModelAdmin):
        pass
    
    opened by Nix3c3r 1
Owner
Joshua Carp
Joshua Carp
Django package to log request values such as device, IP address, user CPU time, system CPU time, No of queries, SQL time, no of cache calls, missing, setting data cache calls for a particular URL with a basic UI.

django-web-profiler's documentation: Introduction: django-web-profiler is a django profiling tool which logs, stores debug toolbar statistics and also

MicroPyramid 77 Oct 29, 2022
pdb++, a drop-in replacement for pdb (the Python debugger)

pdb++, a drop-in replacement for pdb What is it? This module is an extension of the pdb module of the standard library. It is meant to be fully compat

null 1k Dec 24, 2022
Full-screen console debugger for Python

PuDB: a console-based visual debugger for Python Its goal is to provide all the niceties of modern GUI-based debuggers in a more lightweight and keybo

Andreas Klöckner 2.6k Jan 1, 2023
Trace any Python program, anywhere!

lptrace lptrace is strace for Python programs. It lets you see in real-time what functions a Python program is running. It's particularly useful to de

Karim Hamidou 687 Nov 20, 2022
Debugging manhole for python applications.

Overview docs tests package Manhole is in-process service that will accept unix domain socket connections and present the stacktraces for all threads

Ionel Cristian Mărieș 332 Dec 7, 2022
Debugger capable of attaching to and injecting code into python processes.

DISCLAIMER: This is not an official google project, this is just something I wrote while at Google. Pyringe What this is Pyringe is a python debugger

Google 1.6k Dec 15, 2022
(OLD REPO) Line-by-line profiling for Python - Current repo ->

line_profiler and kernprof line_profiler is a module for doing line-by-line profiling of functions. kernprof is a convenient script for running either

Robert Kern 3.6k Jan 6, 2023
Monitor Memory usage of Python code

Memory Profiler This is a python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for pyth

Fabian Pedregosa 80 Nov 18, 2022
Sampling profiler for Python programs

py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe

Ben Frederickson 9.5k Jan 8, 2023
🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.

Pyflame: A Ptracing Profiler For Python (This project is deprecated and not maintained.) Pyflame is a high performance profiling tool that generates f

Uber Archive 3k Jan 7, 2023
Visual profiler for Python

vprof vprof is a Python package providing rich and interactive visualizations for various Python program characteristics such as running time and memo

Nick Volynets 3.9k Jan 1, 2023
Parsing ELF and DWARF in Python

pyelftools pyelftools is a pure-Python library for parsing and analyzing ELF files and DWARF debugging information. See the User's guide for more deta

Eli Bendersky 1.6k Jan 4, 2023
pdb++, a drop-in replacement for pdb (the Python debugger)

pdb++, a drop-in replacement for pdb What is it? This module is an extension of the pdb module of the standard library. It is meant to be fully compat

null 1k Jan 2, 2023
Run-time type checker for Python

This library provides run-time type checking for functions defined with PEP 484 argument (and return) type annotations. Four principal ways to do type

Alex Grönholm 1.1k Jan 5, 2023
Graphical Python debugger which lets you easily view the values of all evaluated expressions

birdseye birdseye is a Python debugger which records the values of expressions in a function call and lets you easily view them after the function exi

Alex Hall 1.5k Dec 24, 2022
A powerful set of Python debugging tools, based on PySnooper

snoop snoop is a powerful set of Python debugging tools. It's primarily meant to be a more featureful and refined version of PySnooper. It also includ

Alex Hall 874 Jan 8, 2023
Inject code into running Python processes

pyrasite Tools for injecting arbitrary code into running Python processes. homepage: http://pyrasite.com documentation: http://pyrasite.rtfd.org downl

Luke Macken 2.7k Jan 8, 2023
Cyberbrain: Python debugging, redefined.

Cyberbrain1(电子脑) aims to free programmers from debugging.

laike9m 2.3k Jan 7, 2023
Voltron is an extensible debugger UI toolkit written in Python.

Voltron is an extensible debugger UI toolkit written in Python. It aims to improve the user experience of various debuggers (LLDB, GDB, VDB an

snare 5.9k Dec 30, 2022