A slick ORM cache with automatic granular event-driven invalidation.

Overview

Cacheops

A slick app that supports automatic or manual queryset caching and automatic granular event-driven invalidation.

It uses Redis as the backend for the ORM cache, and Redis or the filesystem for the simple time-invalidated one.

And there is more to it:

  • decorators to cache any user function or view as a queryset or by time
  • extensions for django and jinja2 templates
  • transparent transaction support
  • dog-pile prevention mechanism
  • a couple of hacks to make django faster

Requirements

Python 3.5+, Django 2.1+ and Redis 4.0+.

Installation

Using pip:

$ pip install django-cacheops

# Or from github directly
$ pip install git+https://github.com/Suor/django-cacheops.git@master

Setup

Add cacheops to your INSTALLED_APPS.

Set up the Redis connection and enable caching for the desired models:

CACHEOPS_REDIS = {
    'host': 'localhost', # redis-server is on same machine
    'port': 6379,        # default redis port
    'db': 1,             # SELECT non-default redis database
                         # using separate redis db or redis instance
                         # is highly recommended

    'socket_timeout': 3,   # connection timeout in seconds, optional
    'password': '...',     # optional
    'unix_socket_path': '' # replaces host and port
}

# Alternatively the redis connection can be defined using a URL:
CACHEOPS_REDIS = "redis://localhost:6379/1"
# or
CACHEOPS_REDIS = "unix://path/to/socket?db=1"
# or with password (note a colon)
CACHEOPS_REDIS = "redis://:password@localhost:6379/1"

# If you want to use sentinel, specify this variable
CACHEOPS_SENTINEL = {
    'locations': [('localhost', 26379)], # sentinel locations, required
    'service_name': 'mymaster',          # sentinel service name, required
    'socket_timeout': 0.1,               # connection timeout in seconds, optional
    'db': 0,                             # redis database, default: 0
    # ...                                # everything else is passed to Sentinel()
}

# To use your own redis client class, set this variable.
# The class should be compatible with or subclass cacheops.redis.CacheopsRedis
CACHEOPS_CLIENT_CLASS = 'your.redis.ClientClass'

CACHEOPS = {
    # Automatically cache any User.objects.get() calls for 15 minutes
    # This also includes .first() and .last() calls,
    # as well as request.user or post.author access,
    # where Post.author is a foreign key to auth.User
    'auth.user': {'ops': 'get', 'timeout': 60*15},

    # Automatically cache all gets and queryset fetches
    # to other django.contrib.auth models for an hour
    'auth.*': {'ops': {'fetch', 'get'}, 'timeout': 60*60},

    # Cache all queries to Permission
    # 'all' is an alias for {'get', 'fetch', 'count', 'aggregate', 'exists'}
    'auth.permission': {'ops': 'all', 'timeout': 60*60},

    # Enable manual caching on all other models with default timeout of an hour
    # Use Post.objects.cache().get(...)
    #  or Tags.objects.filter(...).order_by(...).cache()
    # to cache particular ORM request.
    # Invalidation is still automatic
    '*.*': {'ops': (), 'timeout': 60*60},

    # And since ops is empty by default you can rewrite last line as:
    '*.*': {'timeout': 60*60},

    # NOTE: binding signals has its overhead, like preventing fast mass deletes,
    #       you might want to only register whatever you cache and dependencies.

    # Finally you can explicitly forbid even manual caching with:
    'some_app.*': None,
}

You can configure default profile settings with CACHEOPS_DEFAULTS. This way you can rewrite the config above as:

CACHEOPS_DEFAULTS = {
    'timeout': 60*60
}
CACHEOPS = {
    'auth.user': {'ops': 'get', 'timeout': 60*15},
    'auth.*': {'ops': ('fetch', 'get')},
    'auth.permission': {'ops': 'all'},
    '*.*': {},
}

Using '*.*' with non-empty ops is not recommended, since it will easily cache something you don't intend to cache or don't even know about, such as migrations tables. A better approach is to restrict caching by app with 'app_name.*'.
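To make the precedence between profile keys concrete, here is a minimal pure-Python sketch. It assumes the most specific key wins ('app.model', then 'app.*', then '*.*'), which is what the comments in the config above suggest. The resolve_profile helper and the blog app are hypothetical and only illustrate the lookup; this is not cacheops' own code.

```python
# Illustrative model of profile lookup; resolve_profile is a hypothetical helper.
CACHEOPS_DEFAULTS = {'timeout': 60*60}
CACHEOPS = {
    'auth.user': {'ops': 'get', 'timeout': 60*15},
    'auth.*': {'ops': ('fetch', 'get')},
    'blog.*': {'ops': 'all'},   # hypothetical app, cached explicitly
    'some_app.*': None,         # caching forbidden
    '*.*': {},                  # manual caching only for everything else
}

ALL_OPS = {'get', 'fetch', 'count', 'aggregate', 'exists'}


def resolve_profile(app_label, model_name):
    """Return the effective profile for a model, most specific key first."""
    guesses = ('%s.%s' % (app_label, model_name), '%s.*' % app_label, '*.*')
    for key in guesses:
        if key not in CACHEOPS:
            continue
        raw = CACHEOPS[key]
        if raw is None:
            return None
        profile = dict(CACHEOPS_DEFAULTS, **raw)
        ops = profile.get('ops', ())
        if ops == 'all':
            ops = ALL_OPS
        elif isinstance(ops, str):
            ops = {ops}
        profile['ops'] = set(ops)
        return profile
    return None


user_profile = resolve_profile('auth', 'user')
group_profile = resolve_profile('auth', 'group')
migration_profile = resolve_profile('migrations', 'migration')
```

With this config the migrations table falls through to '*.*' and gets an empty ops set, so nothing is cached for it automatically.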

Besides ops and timeout options you can also use:

local_get: True
Caches simple gets for this model in process-local memory. This is very fast, but the cache is not invalidated in any way until the process is restarted. It can still be useful for things that change extremely rarely.
cache_on_save=True | 'field_name'
Writes an instance to cache upon save. The cached instance will be retrieved on a .get(field_name=...) request. Setting it to True causes caching by primary key.

Additionally, you can tell cacheops to degrade gracefully on Redis failure with:

CACHEOPS_DEGRADE_ON_FAILURE = True

There is also a possibility to make all cacheops methods and decorators no-op, e.g. for testing:

from django.test import override_settings

@override_settings(CACHEOPS_ENABLED=False)
def test_something():
    # ...
    assert cond

Usage

Automatic caching

It's automatic; you just need to set it up.

Manual caching

You can force any queryset to use cache by calling its .cache() method:

Article.objects.filter(tag=2).cache()

Here you can specify which ops should be cached for the queryset, for example, this code:

from django.core.paginator import Paginator

qs = Article.objects.filter(tag=2).cache(ops=['count'])
paginator = Paginator(qs, ipp)
articles = list(paginator.page(page_num)) # hits database

will cache the count call in Paginator but not the later articles fetch. There are five possible ops: get, fetch, count, aggregate and exists. You can pass any subset of these ops to the .cache() method, even an empty one to turn off caching. There is, however, a shortcut for the latter:

qs = Article.objects.filter(visible=True).nocache()
qs1 = qs.filter(tag=2)       # hits database
qs2 = qs.filter(category=3)  # hits it once more

It is useful when you want to disable automatic caching on a particular queryset.

You can also override the default timeout for a particular queryset with .cache(timeout=...).

Function caching

You can cache and invalidate the result of a function the same way as a queryset. Cached results of the following function will be invalidated on any Article change, addition or deletion:

from django.db.models import Count
from cacheops import cached_as

@cached_as(Article, timeout=120)
def article_stats():
    return {
        'tags': list(Article.objects.values('tag').annotate(Count('id'))),
        'categories': list(Article.objects.values('category').annotate(Count('id')))
    }

Note that we are using list on both querysets here. This is because we don't want to cache queryset objects but their results.

Also note that if you want to filter queryset based on arguments, e.g. to make invalidation more granular, you can use a local function:

def articles_block(category, count=5):
    qs = Article.objects.filter(category=category)

    @cached_as(qs, extra=count)
    def _articles_block():
        articles = list(qs.filter(photo=True)[:count])
        if len(articles) < count:
            articles += list(qs.filter(photo=False)[:count-len(articles)])
        return articles

    return _articles_block()

We added extra here to make different keys for calls with same category but different count. Cache key will also depend on function arguments, so we could just pass count as an argument to inner function. We also omitted timeout here, so a default for the model will be used.

Another possibility is to make function cache invalidate on changes to any one of several models:

@cached_as(Article.objects.filter(public=True), Tag)
def article_stats():
    return {...}

As you can see, we can mix querysets and models here.

View caching

You can also cache and invalidate a view as a queryset. This works mostly the same way as function caching, but only the path of the request parameter is used to construct the cache key:

from cacheops import cached_view_as

@cached_view_as(News)
def news_index(request):
    # ...
    return HttpResponse(...)

You can pass timeout, extra and several samples the same way as to @cached_as().

Class based views can also be cached:

class NewsIndex(ListView):
    model = News

news_index = cached_view_as(News)(NewsIndex.as_view())

Invalidation

Cacheops uses both time and event-driven invalidation. The event-driven one listens on model signals and invalidates appropriate caches on Model.save(), .delete() and m2m changes.

Invalidation tries to be granular, which means it won't invalidate a queryset that, judging by its query conditions, cannot be influenced by the added, updated or deleted object. Most of the time this will do what you want. If it doesn't, you can use one of the following:

from cacheops import invalidate_obj, invalidate_model, invalidate_all

invalidate_obj(some_article)  # invalidates queries affected by some_article
invalidate_model(Article)     # invalidates all queries for model
invalidate_all()              # flush redis cache database

And last there is invalidate command:

./manage.py invalidate articles.Article.34  # same as invalidate_obj
./manage.py invalidate articles.Article     # same as invalidate_model
./manage.py invalidate articles   # invalidate all models in articles

And the one that FLUSHES cacheops redis database:

./manage.py invalidate all

Don't use that if you share redis database for both cache and something else.
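To illustrate what "granular" means in this section, here is a simplified pure-Python model. It assumes each cached query is indexed by the set of equality conditions it was built from, and that saving an object drops only the queries whose conditions match that object's field values. The class and names are hypothetical. Real cacheops keeps comparable structures in Redis and handles more lookup types, so treat this as a sketch of the idea rather than the implementation.

```python
class GranularCache:
    """Toy model: cached queries are grouped by their equality conditions."""

    def __init__(self):
        self.data = {}        # cache key -> cached result
        self.schemes = set()  # tuples of field names used by cached queries
        self.conj = {}        # ((field, value), ...) -> set of cache keys

    def store(self, key, conditions, result):
        scheme = tuple(sorted(conditions))
        conj = tuple((field, conditions[field]) for field in scheme)
        self.schemes.add(scheme)
        self.conj.setdefault(conj, set()).add(key)
        self.data[key] = result

    def invalidate_obj(self, obj):
        # For every known scheme, build the conjunction this object satisfies
        # and drop exactly the cached queries registered under it.
        for scheme in self.schemes:
            conj = tuple((field, obj[field]) for field in scheme)
            for key in self.conj.pop(conj, set()):
                self.data.pop(key, None)


cache = GranularCache()
cache.store('q1', {'tag': 2}, ['a1'])
cache.store('q2', {'tag': 3}, ['a2'])
cache.store('q3', {'tag': 2, 'category': 5}, ['a3'])
cache.store('q4', {}, ['everything'])

# An article with tag=2, category=7 is saved
cache.invalidate_obj({'id': 10, 'tag': 2, 'category': 7})
remaining = sorted(cache.data)
```

Here q1 and the unconditioned q4 are dropped, while q2 and q3 survive because the saved object cannot appear in their results. On an update, the same step would run for both the old and the new state of the object.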

Turning off and postponing invalidation

There is also a way to turn off invalidation for a while:

from cacheops import no_invalidation

with no_invalidation:
    # ... do some changes
    obj.save()

Also works as decorator:

@no_invalidation
def some_work(...):
    # ... do some changes
    obj.save()

Combined with try ... finally it could be used to postpone invalidation:

try:
    with no_invalidation:
        # ...
finally:
    invalidate_obj(...)
    # ... or
    invalidate_model(...)

Postponing invalidation can speed up batch jobs.

Mass updates

Normally qs.update(...) doesn't emit any events and thus doesn't trigger invalidation. And there is no transparent and efficient way to do that: trying to act on conditions will invalidate too much if update conditions are orthogonal to many queries conditions, and to act on specific objects we will need to fetch all of them, which QuerySet.update() users generally try to avoid.

In case you actually want to perform the latter, cacheops provides a shortcut:

qs.invalidated_update(...)

Note that all the updated objects are fetched twice, before and after the update.

Simple time-invalidated cache

To cache result of a function call or a view for some time use:

from cacheops import cached, cached_view

@cached(timeout=number_of_seconds)
def top_articles(category):
    return ... # Some costly queries

@cached_view(timeout=number_of_seconds)
def top_articles(request, category=None):
    # Some costly queries
    return HttpResponse(...)

@cached() will generate a separate entry for each combination of decorated function and its arguments. You can also use extra the same way as in @cached_as(), which is most useful for nested functions:

@property
def articles_json(self):
    @cached(timeout=10*60, extra=self.category_id)
    def _articles_json():
        ...
        return json.dumps(...)

    return _articles_json()

You can manually invalidate or update a result of a cached function:

top_articles.invalidate(some_category)
top_articles.key(some_category).set(new_value)
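The per-argument entries and the .invalidate() call can be pictured with a small in-memory stand-in. This is only a toy under stated assumptions: it keeps values in a dict rather than Redis, requires hashable arguments, and its key layout is made up for illustration. It is not how cacheops builds keys.

```python
import functools
import time


def cached(timeout, extra=None):
    """Toy in-memory stand-in for a time-invalidated @cached decorator."""
    def decorator(func):
        store = {}  # key -> (expires_at, value)

        def make_key(args, kwargs):
            # One entry per function, argument combination and extra
            return (func.__module__, func.__qualname__, args,
                    tuple(sorted(kwargs.items())), extra)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = make_key(args, kwargs)
            entry = store.get(key)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]
            value = func(*args, **kwargs)
            store[key] = (time.monotonic() + timeout, value)
            return value

        def invalidate(*args, **kwargs):
            store.pop(make_key(args, kwargs), None)

        wrapper.invalidate = invalidate
        return wrapper
    return decorator


calls = []

@cached(timeout=60)
def top_articles(category):
    calls.append(category)  # record every real computation
    return 'top of %s' % category


top_articles('news')
top_articles('news')             # served from the toy cache
top_articles('sport')            # different argument, separate entry
top_articles.invalidate('news')
top_articles('news')             # computed again after invalidation
```

After these calls, calls holds 'news', 'sport', 'news': the second 'news' call never reached the function, and the invalidated one did.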

To invalidate cached view you can pass absolute uri instead of request:

top_articles.invalidate('http://example.com/page', some_category)

Cacheops also provides get/set primitives for simple cache:

from cacheops import cache

cache.set(cache_key, data, timeout=None)
cache.get(cache_key)
cache.delete(cache_key)

cache.get will raise CacheMiss if nothing is stored for given key:

from cacheops import cache, CacheMiss

try:
    result = cache.get(key)
except CacheMiss:
    ... # deal with it

File Cache

File based cache can be used the same way as simple time-invalidated one:

from cacheops import file_cache

@file_cache.cached(timeout=number_of_seconds)
def top_articles(category):
    return ... # Some costly queries

@file_cache.cached_view(timeout=number_of_seconds)
def top_articles(request, category):
    # Some costly queries
    return HttpResponse(...)

# later, on appropriate event
top_articles.invalidate(some_category)
# or
top_articles.key(some_category).set(some_value)

# primitives
file_cache.set(cache_key, data, timeout=None)
file_cache.get(cache_key)
file_cache.delete(cache_key)

It has two improvements over Django's built-in file cache, both relevant under high load. First, it's safe against concurrent writes. Second, its invalidation is done as a separate task; you'll need to call this from crontab for that to work:

/path/manage.py cleanfilecache
/path/manage.py cleanfilecache /path/to/non-default/cache/dir
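As background on what "safe against concurrent writes" can mean for a file cache, here is one common technique: write to a temporary file in the same directory and atomically rename it over the target. This is a generic sketch and an assumption about the approach, not a description of cacheops' actual code; atomic_write is a hypothetical helper.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers never observe a partial file."""
    directory = os.path.dirname(path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
        # Rename is atomic when source and target share a filesystem
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as cache_dir:
    target = os.path.join(cache_dir, 'top_articles.cache')
    atomic_write(target, b'first')
    atomic_write(target, b'second')  # replaces the file in one step
    with open(target, 'rb') as f:
        content = f.read()
    leftovers = os.listdir(cache_dir)
```

Two writers racing on the same key each produce a complete file, and whichever rename lands last wins; a reader sees either the old or the new content, never a mix.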

Django templates integration

Cacheops provides tags to cache template fragments. They mimic @cached_as and @cached decorators, however, they require explicit naming of each fragment:

{% load cacheops %}

{% cached_as <queryset> <timeout> <fragment_name> [<extra1> <extra2> ...] %}
    ... some template code ...
{% endcached_as %}

{% cached <timeout> <fragment_name> [<extra1> <extra2> ...] %}
    ... some template code ...
{% endcached %}

You can pass None as the timeout to the cached_as tag to use the model's default value.

To invalidate cached fragment use:

from cacheops import invalidate_fragment

invalidate_fragment(fragment_name, extra1, ...)

If you have more complex fragment caching needs, cacheops provides a helper to make your own template tags which decorate a template fragment in a way analogous to decorating a function with @cached or @cached_as. This is an experimental feature for now.

To use it create myapp/templatetags/mycachetags.py and add something like this there:

from cacheops import cached_as, CacheopsLibrary

register = CacheopsLibrary()

@register.decorator_tag(takes_context=True)
def cache_menu(context, menu_name):
    from django.utils import translation
    from myapp.models import Flag, MenuItem

    request = context.get('request')
    if request and request.user.is_staff:
        # Use noop decorator to bypass caching for staff
        return lambda func: func

    return cached_as(
        # Invalidate cache if any menu item or a flag for menu changes
        MenuItem,
        Flag.objects.filter(name='menu'),
        # Vary for menu name and language, also stamp it as "menu" to be safe
        extra=("menu", menu_name, translation.get_language()),
        timeout=24 * 60 * 60
    )

@decorator_tag here creates a template tag that behaves the same way as the returned decorator applied to the wrapped template fragment. The resulting template tag can be used as follows:

{% load mycachetags %}

{% cache_menu "top" %}
    ... the top menu template code ...
{% endcache_menu %}

... some template code ..

{% cache_menu "bottom" %}
    ... the bottom menu template code ...
{% endcache_menu %}

Jinja2 extension

Add cacheops.jinja2.cache to your extensions and use:

{% cached_as <queryset> [, timeout=<timeout>] [, extra=<key addition>] %}
    ... some template code ...
{% endcached_as %}

or

{% cached [timeout=<timeout>] [, extra=<key addition>] %}
    ...
{% endcached %}

Tags work the same way as corresponding decorators.

Transactions

Cacheops transparently supports transactions. This is implemented by following simple rules:

  1. Once transaction is dirty (has changes) caching turns off. The reason is that the state of database at this point is only visible to current transaction and should not affect other users and vice versa.
  2. Any invalidating calls are scheduled to run on the outer commit of transaction.
  3. Savepoints and rollbacks are also handled appropriately.

Mind that the simple and file caches don't turn themselves off in transactions but work as usual.
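The three rules above can be pictured with a small state machine. This is a toy model written for this README under the assumption that each nested transaction (savepoint) gets its own frame. TransactionState and its method names are hypothetical and do not mirror cacheops internals.

```python
class TransactionState:
    """Toy model of the three rules; not cacheops' real implementation."""

    def __init__(self, invalidate):
        self._invalidate = invalidate  # callback doing the real invalidation
        self._stack = []               # one frame per (nested) transaction

    def begin(self):
        self._stack.append({'dirty': False, 'pending': []})

    def mark_dirty(self):
        self._stack[-1]['dirty'] = True

    def caching_enabled(self):
        # Rule 1: once any open transaction has changes, caching is off
        return not any(frame['dirty'] for frame in self._stack)

    def schedule_invalidation(self, target):
        if self._stack:
            self._stack[-1]['pending'].append(target)
        else:
            self._invalidate(target)   # no transaction: invalidate right away

    def commit(self):
        frame = self._stack.pop()
        if self._stack:
            # Savepoint release: hand state over to the parent transaction
            parent = self._stack[-1]
            parent['dirty'] = parent['dirty'] or frame['dirty']
            parent['pending'].extend(frame['pending'])
        else:
            # Rule 2: the outermost commit runs the queued invalidations
            for target in frame['pending']:
                self._invalidate(target)

    def rollback(self):
        # Rule 3: rolled back changes never happened, so drop the frame
        self._stack.pop()


invalidated = []
tx = TransactionState(invalidated.append)

tx.begin()
enabled_at_start = tx.caching_enabled()
tx.mark_dirty()
tx.schedule_invalidation('article:1')

tx.begin()                             # savepoint
tx.schedule_invalidation('article:2')
tx.rollback()                          # savepoint rolled back

enabled_when_dirty = tx.caching_enabled()
invalidated_before_commit = list(invalidated)
tx.commit()
```

Only 'article:1' is invalidated, and only after the outer commit; the invalidation queued inside the rolled back savepoint is discarded.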

Dog-pile effect prevention

There is an optional locking mechanism to prevent several threads or processes from simultaneously performing the same heavy task. It works with @cached_as() and querysets:

@cached_as(qs, lock=True)
def heavy_func(...):
    # ...

for item in qs.cache(lock=True):
    # ...

It is also possible to specify lock: True in CACHEOPS setting but that would probably be a waste. Locking has no overhead on cache hit though.
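For intuition about the dog-pile effect, here is an in-process analogue using threading.Lock. It is only a sketch: protecting several processes requires a lock in shared storage such as Redis, and the LockedCache class is hypothetical. It does show the two properties mentioned above, namely that the heavy task runs once and that a cache hit takes no lock.

```python
import threading
import time


class LockedCache:
    """Toy cache that lets only one thread compute a missing key."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()

    def get_or_compute(self, key, compute):
        if key in self._data:          # cache hit: no locking at all
            return self._data[key]
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            if key not in self._data:  # re-check once the lock is held
                self._data[key] = compute()
            return self._data[key]


calls = []

def heavy():
    calls.append(1)
    time.sleep(0.05)  # simulate an expensive query
    return 42


cache = LockedCache()
results = []

def worker():
    results.append(cache.get_or_compute('stats', heavy))

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads get the same value, while heavy() runs a single time; without the lock, every thread arriving during the first computation would start its own.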

Multiple database support

By default cacheops considers the result of a query to be the same regardless of which database is queried. This can be changed with the db_agnostic cache profile option:

CACHEOPS = {
    'some.model': {'ops': 'get', 'db_agnostic': False, 'timeout': ...}
}

Sharing redis instance

Cacheops provides a way to share a redis instance by adding prefix to cache keys:

CACHEOPS_PREFIX = lambda query: ...
# or
CACHEOPS_PREFIX = 'some.module.cacheops_prefix'

The most common usage would probably be a prefix by host name:

# get_request() returns current request saved to threadlocal by some middleware
cacheops_prefix = lambda _: get_request().get_host()

A query object passed to callback also enables reflection on used databases and tables:

def cacheops_prefix(query):
    query.dbs    # A list of databases queried
    query.tables # A list of tables query is invalidated on

    if set(query.tables) <= HELPER_TABLES:
        return 'helper:'
    if query.tables == ['blog_post']:
        return 'blog:'

NOTE: prefix is not used in simple and file cache. This might change in future cacheops.
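A prefix callback can be exercised without Django by passing it a stand-in object that has dbs and tables attributes. In the sketch below the HELPER_TABLES contents are made up, and returning an empty string for the fall-through case is an assumption (on the reasoning that a prefix is most likely joined into the key as a string), not documented behaviour.

```python
from types import SimpleNamespace

# Hypothetical table names, for illustration only
HELPER_TABLES = {'helper_country', 'helper_currency'}


def cacheops_prefix(query):
    """Pick a key prefix from the tables a query is invalidated on."""
    tables = set(query.tables)
    if tables and tables <= HELPER_TABLES:
        return 'helper:'
    if tables == {'blog_post'}:
        return 'blog:'
    return ''  # assumed fallback: no prefix for everything else


# SimpleNamespace stands in for the query object passed by cacheops
blog_query = SimpleNamespace(dbs=['default'], tables=['blog_post'])
helper_query = SimpleNamespace(dbs=['default'], tables=['helper_country'])
mixed_query = SimpleNamespace(dbs=['default'], tables=['blog_post', 'auth_user'])

prefixes = [cacheops_prefix(q) for q in (blog_query, helper_query, mixed_query)]
```

Keeping the callback a plain function of query.tables makes it easy to unit test like this before wiring it into CACHEOPS_PREFIX.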

Using memory limit

If your cache never grows too large you need not bother. But if it does, you have some options. Cacheops stores cached data along with invalidation data, so you can't just set maxmemory and let redis evict at will. For now cacheops offers 2 imperfect strategies, which are considered experimental. So be careful and consider leaving feedback.

The first strategy is configuring maxmemory-policy volatile-ttl. Invalidation data is guaranteed to have a higher TTL than the keys it references. Redis, however, doesn't guarantee perfect TTL eviction order: it samples several keys and removes the one with the lowest TTL. Thus an invalidator could be evicted before the cache key it refers to, leaving that key orphaned and causing it to survive the next invalidation. You can reduce this chance by increasing the maxmemory-samples redis config option and by reducing cache timeouts.

The second strategy, probably the more efficient one, is adding CACHEOPS_LRU = True to your settings and then using maxmemory-policy volatile-lru. However, this makes invalidation structures persistent: they are still removed on associated events, but in the absence of such events they can clutter the redis database.

Keeping stats

Cacheops provides cache_read and cache_invalidated signals for you to keep track.

Cache read signal is emitted immediately after each cache lookup. Passed arguments are: sender - model class if queryset cache is fetched, func - decorated function and hit - fetch success as boolean value.

Here is a simple stats implementation:

from cacheops.signals import cache_read
from statsd.defaults.django import statsd

def stats_collector(sender, func, hit, **kwargs):
    event = 'hit' if hit else 'miss'
    statsd.incr('cacheops.%s' % event)

cache_read.connect(stats_collector)
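If statsd is not available, the same receiver shape can feed a plain in-process counter. The sketch below simulates signal deliveries by calling the receiver directly so that it runs without Django; the hit_ratio helper is hypothetical, and in a real project the receiver would be connected to cache_read as shown above.

```python
from collections import Counter

stats = Counter()


def stats_collector(sender, func, hit, **kwargs):
    """Receiver with the cache_read signature that tallies hits and misses."""
    stats['hit' if hit else 'miss'] += 1


def hit_ratio():
    total = stats['hit'] + stats['miss']
    return stats['hit'] / total if total else 0.0


# In a Django project this would be wired up with:
#   from cacheops.signals import cache_read
#   cache_read.connect(stats_collector)

# Simulated signal deliveries: three hits and one miss
for hit in (True, True, False, True):
    stats_collector(sender=None, func=None, hit=hit)

ratio = hit_ratio()
```

Note that a counter like this lives in one process only, so with several workers each keeps its own tally.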

Cache invalidation signal is emitted after object, model or global invalidation passing sender and obj_dict args. Note that during normal operation cacheops only uses object invalidation, calling it once for each model create/delete and twice for update: passing old and new object dictionary.

CAVEATS

  1. Conditions other than __exact, __in and __isnull=True don't make invalidation more granular.
  2. Conditions on TextFields, FileFields and BinaryFields don't make it either. One should not test on their equality anyway. See CACHEOPS_SKIP_FIELDS though.
  3. Update of a "select_related" object does not invalidate the cache for the queryset. Use .prefetch_related() instead.
  4. Mass updates don't trigger invalidation by default. But see .invalidated_update().
  5. Sliced queries are invalidated as non-sliced ones.
  6. Doesn't work with .raw() and other sql queries.
  7. Conditions on subqueries don't affect invalidation.
  8. Doesn't work right with multi-table inheritance.

Here 1, 2, 3, 5 are part of the design compromise, trying to solve them will make things complicated and slow. 7 can be implemented if needed, but it's probably counter-productive since one can just break queries into simpler ones, which cache better. 4 is a deliberate choice, making it "right" will flush cache too much when update conditions are orthogonal to most queries conditions, see, however, .invalidated_update(). 8 is postponed until it will gain more interest or a champion willing to implement it emerges.

All unsupported things could still be used easily enough with the help of @cached_as().

Performance tips

Here come some performance tips to make cacheops and Django ORM faster.

  1. When you use cache you pickle and unpickle lots of django model instances, which could be slow. You can optimize django models serialization with django-pickling.

  2. Constructing querysets is rather slow in django, mainly because most of QuerySet methods clone self, then change it and return the clone. Original queryset is usually thrown away. Cacheops adds .inplace() method, which makes queryset mutating, preventing useless cloning:

    items = Item.objects.inplace().filter(category=12).order_by('-date')[:20]
    

    You can revert queryset to cloning state using .cloning() call.

    Note that this is a micro-optimization technique. Using it is only desirable in the hottest places, not everywhere.

  3. Use template fragment caching when possible. It's much faster because you don't need to generate anything. Also, pickling/unpickling a string is much faster than a list of model instances.

  4. Run separate redis instance for cache with disabled persistence. You can manually call SAVE or BGSAVE to stay hot upon server restart.

  5. If you filter a queryset on many different or complex conditions, the cache could degrade performance (compared to uncached db calls) as a consequence of frequent cache misses. Disable the cache in such cases entirely, or based on some heuristic which detects whether the request would probably be a hit. E.g. enable the cache only if some primary fields are used in the filter.

    Caching querysets with large amount of filters also slows down all subsequent invalidation on that model. You can disable caching if more than some amount of fields is used in filter simultaneously.
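Regarding tip 2, the difference between cloning and in-place chaining can be seen in a toy query builder. The Query class below is hypothetical and far simpler than Django's QuerySet; it only counts how many copies each chaining style creates.

```python
import copy


class Query:
    """Toy chainable builder with cloning and in-place modes."""

    clones_made = 0

    def __init__(self):
        self.filters = {}
        self._inplace = False

    def _next(self):
        if self._inplace:
            return self             # mutate and return the same object
        Query.clones_made += 1
        return copy.deepcopy(self)  # default: each call works on a copy

    def inplace(self):
        self._inplace = True
        return self

    def cloning(self):
        self._inplace = False
        return self

    def filter(self, **conditions):
        query = self._next()
        query.filters.update(conditions)
        return query


base = Query()
a = base.filter(category=12).filter(visible=True)  # two clones, base untouched
clones_when_cloning = Query.clones_made

b = Query().inplace().filter(category=12).filter(visible=True)
clones_when_inplace = Query.clones_made - clones_when_cloning
```

The cloning chain leaves base unchanged at the cost of one copy per call, while the in-place chain makes no copies but mutates the object it started from. That is why it suits throwaway chains in hot code paths.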

Writing a test

Writing a test for an issue you are experiencing can speed up its resolution a lot. Here is how you do that. I suppose you have some application code causing it.

  1. Make a fork.
  2. Install all from requirements-test.txt.
  3. Ensure you can run tests with ./run_tests.py.
  4. Copy relevant models code to tests/models.py.
  5. Go to tests/tests.py and paste code causing exception to IssueTests.test_{issue_number}.
  6. Execute ./run_tests.py {issue_number} and see it failing.
  7. Cut down model and test code until the error disappears, then take a step back.
  8. Commit changes and make a pull request.

TODO

  • faster .get() handling for simple cases such as get by pk/id, with simple key calculation
  • integrate previous one with prefetch_related()
  • shard cache between multiple redises
  • respect subqueries?
  • respect headers in @cached_view*?
  • group invalidate_obj() calls?
  • a postpone invalidation context manager/decorator?
  • fast mode: store cache in local memory, but check in with redis if it's valid
  • an interface for complex fields to extract exact on parts or transforms: ArrayField.len => field__len=?, ArrayField[0] => field__0=?, JSONField['some_key'] => field__some_key=?
  • custom cache eviction strategy in lua
  • cache a string directly (no pickle) for direct serving (custom key function?)

Comments

  • Support for django ATOMIC_REQUESTS

    When using Django's ATOMIC_REQUESTS django-cacheops silently has no effect since caching is disabled inside transactions (see https://github.com/Suor/django-cacheops/pull/171).

    Caching inside transactions is tricky, since we don't know if the view we have from the database will be committed at all, so it simply gets skipped. This pull request takes that strategy a little further, only skipping caching if the current transaction has changes at all. In a typical web app most requests are read-only, so with this PR they can profit from django-cacheops when Django's ATOMIC_REQUESTS is used.

    opened by ihucos 34
  • Invalidation problem

    I noticed this strange behavior during cache invalidation.

    1. I start the application
    2. I make a request to a view that queries the DB: b = Brand.objects.get(slug='slug_key')
    3. redis-cli shows: redis 127.0.0.1:6379[2]> smembers "schemes:catalog.brand"
      1. "slug"
      2. "id"
    4. I run flushdb
    5. I repeat the request from step 2
    6. The invalidation keys are not created: redis 127.0.0.1:6379[2]> smembers "schemes:catalog.brand" (empty list or set)

    Although they are present in the cache:

    redis 127.0.0.1:6379[2]> keys brand

    1. "conj:catalog.brand:id=1"
    2. "conj:catalog.brand:slug=slug_key"

    What can be done so that queries by slug=*** get invalidated?

    opened by MikeVL 32
  • cacheops fails at runtime on 4th version

    Hello. On cacheops 3.2.1 there are no runtime errors. But after I upgraded to the 4th version (I tried all four subversions), I get this error at runtime:

    Unhandled exception in thread started by <_pydev_bundle.pydev_monkey._NewThreadStartupWithTrace object at 0x1114cb898>
    Traceback (most recent call last):
      File "/Users/***/lib/python3.6/site-packages/funcy/calc.py", line 41, in wrapper
        return memory[key]
    KeyError: (<class 'moderator.models.moderator.Moderator'>,)
    
    During handling of the above exception, another exception occurred:
    
    
    Traceback (most recent call last):
      File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_monkey.py", line 589, in __call__
        return self.original_func(*self.args, **self.kwargs)
      ........
      File "/Users/***/lib/python3.6/site-packages/django/core/checks/urls.py", line 13, in check_url_config
        return check_resolver(resolver)
      File "/Users/***/lib/python3.6/site-packages/django/core/checks/urls.py", line 23, in check_resolver
        return check_method()
      ........
      File "/Users/***/lib/python3.6/site-packages/django/urls/resolvers.py", line 529, in urlconf_module
        return import_module(self.urlconf_name)
      File "/Users/***/lib/python3.6/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      ........
      File "/Users/***/moderator/models/moderator.py", line 10, in <module>
        class Moderator(auth_models.AbstractBaseUser, auth_models.PermissionsMixin):
      File "/Users/***/lib/python3.6/site-packages/django/db/models/base.py", line 152, in __new__
        new_class.add_to_class(obj_name, obj)
      File "/Users/***/lib/python3.6/site-packages/django/db/models/base.py", line 315, in add_to_class
        value.contribute_to_class(cls, name)
      File "/Users/***/lib/python3.6/site-packages/cacheops/query.py", line 422, in contribute_to_class
        if cls.__module__ != '__fake__' and family_has_profile(cls):
      File "/Users/***/lib/python3.6/site-packages/funcy/calc.py", line 44, in wrapper
        value = memory[key] = func(*args, **kwargs)
      File "/Users/***/lib/python3.6/site-packages/cacheops/utils.py", line 38, in family_has_profile
        return any(model_profile, model_family(cls))
      File "/Users/***/lib/python3.6/site-packages/cacheops/utils.py", line 33, in model_family
        return class_tree(model._meta.concrete_model)
      File "/Users/***/lib/python3.6/site-packages/cacheops/utils.py", line 29, in class_tree
        return [cls] + lmapcat(class_tree, cls.__subclasses__())
    
    AttributeError: 'NoneType' object has no attribute '__subclasses__'
    

    Settings are very easy:

    CACHEOPS_ENABLED = True
    CACHEOPS_DEGRADE_ON_FAILURE = True
    CACHEOPS_REDIS = CACHES['default']['LOCATION']
    CACHEOPS_DEFAULTS = {
        'timeout': 60 * 60,
        'db_agnostic': False
    }
    CACHEOPS = {
        '*.*': {'ops': 'all'},  # only manual caching
    }
    

    I have no idea what's wrong here. Does anybody know?

    opened by thakryptex 28
  • First implementation of dog-pile effect avoidance via python-redis-lock additions

    Hi @Suor,

    here is a patch to enable dog-pile avoidance in cacheops.

    Could you please review it and see whether I missed anything?

    This patched version has been running for hours on my 1flow development machine without problems.

    I will deploy it on http://1flow.io/ at next major release.

    I patched simple & query, and didn't find any other obvious locations to patch.

    Your feedback is appreciated.

    Regards,

    opened by Karmak23 27
  • Custom cache keys or support for multiple schemas in a single database

    PostgreSQL supports multiple schemas. The schemas can have the same table names. Two rows from similar tables in 2 different schemas might cause a race condition and return a cached object from the WRONG schema.

    It can be solved by making it possible to generate the cache key at runtime. We discussed this over at django-cachalot and came up with a solution. Maybe something similar could be usefully done here as well? https://github.com/BertrandBordage/django-cachalot/issues/1

    discussion looking for more interest 
    opened by owais 26
  • Use a local cache during atomic transactions

    In these commits, I've added support to cacheops for django's transaction.atomic context manager. My co-workers and I have run into situations where rolled back data was making it into the cache. I noticed this was on the readme's todo list, so I hope these changes are something you like for cacheops.

    In this new functionality, a thread-specific local cache is set up on atomic. Keys that are usually stored in redis are instead held in a local cache. Since atomic relates to database transactions, this functionality only affects querysets and views.

    When atomic commits, it moves local keys to redis and then clears the local cache. When atomic rolls back, it clears the local cache. Inside atomic blocks, cacheops will check the local cache before redis. Local cache misses cause cacheops to look to redis as before these changes.

    This new functionality also deals with savepoints. Upon entering a nested atomic block, a local cache context is setup. When atomic commits a savepoint, cacheops moves keys up to the parent's context. This parent can be another save point or the outer transaction. When atomic rolls back a savepoint, cacheops will discard the context.

    To support these different contexts, the local cache is a collections.ChainMap instance. Python 3.3 introduced collections.ChainMap. A backport is available for earlier versions. I have added this backport as a cacheops package dependency.

    I have added a setting for this functionality called CACHEOPS_RESPECT_ATOMIC, which defaults to false. Because of the dependency on transaction.Atomic, cacheops ignores this setting unless running django 1.6+.

    I would like to hear any feedback you have on this pull request. Some of my specific questions are:

    • Do you think CACHEOPS_RESPECT_ATOMIC should be true by default instead of false?
    • My co-workers and I are not using the Simple time-invalidated cache or File Cache. My understanding is that these are not related to database transactions. With this assumption, I did not add transaction local caching code to these functions. Is this correct?
    • While developing these changes, I was running the full run_tests.py suite. After passing these tests, I moved the functionality behind the CACHEOPS_RESPECT_ATOMIC setting. I wrote some tests of my own, with the @override_settings decorator to turn this setting on. I also have the BasicTests being rerun with this setting on.
      • Do you think more of the previous tests should be rerun with the new setting on?
      • Would you review the tests I have added? I'm not sure whether they could be simpler or better organized.
    • Should I rebase the commits in this pull request to a single commit?
    opened by jhillacre 20
  • Fix stamp_fields utility function

    Fix stamp_fields utility function

    In the stamp_fields utility function we create a stamp from the list of (f.name, f.attname, f.db_column, f.__class__) tuples for all model fields returned from meta:

    stamp = str(sorted((f.name, f.attname, f.db_column, f.__class__) for f in model._meta.fields))
    

    The last element is the actual class of the field and not a class name. This fails because the list is sorted before being stringified, and classes cannot be ordered. We should use the class name instead.

    Also, if db_column was not provided in the model definition, f.db_column will be None, which cannot be compared with a string either.

    In practice, a model may have two fields with the same name and attname if one field was contributed to the model as private. model._meta.fields returns both local and private fields.
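
    The failure and one possible fix can be reproduced without Django. In the sketch below, FakeField and its subclasses are stand-ins for real model fields, and safe_stamp is only an assumed shape for the fix (class name instead of class, empty string instead of None), not necessarily what was merged.

```python
class FakeField:
    """Stand-in for a Django model field; only the attributes used below."""

    def __init__(self, name, attname, db_column=None):
        self.name = name
        self.attname = attname
        self.db_column = db_column


class CharField(FakeField):
    pass


class TextField(FakeField):
    pass


def naive_stamp(fields):
    # Mirrors the expression quoted above: classes and None end up in the sort key.
    return str(sorted((f.name, f.attname, f.db_column, f.__class__) for f in fields))


def safe_stamp(fields):
    # Assumed fix: sort on strings only (class name, '' in place of None).
    return str(sorted(
        (f.name, f.attname, f.db_column or "", f.__class__.__name__) for f in fields
    ))


# Two fields sharing name and attname force the comparison to reach the class.
fields = [TextField("title", "title"), CharField("title", "title")]

try:
    naive_stamp(fields)
    naive_failed = False
except TypeError:
    naive_failed = True

stamp = safe_stamp(fields)
```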

    opened by lokhman 19
  • Option to temporary disable cacheops

    Option to temporary disable cacheops

    I want to disable cacheops caching in the test configuration. I found CACHEOPS_FAKE=True in the git history, but I see that it was removed. I tried setting CACHEOPS to several different values:

    CACHEOPS = None
    CACHEOPS = {}
    CACHEOPS = {'*.*': None}
    CACHEOPS = {'*.*': {'timeout': 0}}
    

    but then I got exceptions. Why was CACHEOPS_FAKE removed, and how can cacheops be disabled now?

    opened by generalov 18
  • Question on prefetch_related

    Question on prefetch_related

    Hi! Thanks for creating this great solution.

    I use prefetch_related for a few areas in my app, but I noticed that it is not supported. What will happen if I use prefetch_related with CacheOps? Will the separate fetch results just not cache?

    Also, is select_related supported?

    Many thanks for the clarification.

    -Lyle

    opened by lylepratt 18
  • Yet another threading issue with m2m (v2.2.1)

    Yet another threading issue with m2m (v2.2.1)

    Hello,

    I'm experiencing serious problems with m2m fields when cacheops is enabled. I believe the problem is somewhere in threading support, because it does not appear with the development server (as a result, tests pass but features break in production).

    See the code:

    model (simplified):

    class WebsiteCategory(models.Model):
    
        code = models.CharField(
            verbose_name=_(u'category'),
            max_length=80,
            choices=WEBSITE_CATEGORIES,
            unique=True,
        )
    
        class Meta:
            ordering = ['code']
            verbose_name = _(u"category")
            verbose_name_plural = _(u"categories")
    
        def __unicode__(self):
            return self.get_code_display()
    
    class Website(VersionMixin):
        url = models.URLField(
            verbose_name=_(u"URL"),
            max_length=255)
    
        description = models.CharField(
            blank=True,
            default="",
            max_length=500)
    
        categories = models.ManyToManyField(
            'categories.WebsiteCategory',
            related_name='websites')
    

    settings.py:

    ########## CACHEOPS CONFIGURATION
    CACHEOPS_DEFAULTS = {
        'timeout': 60 * 60 * 24  # 24 hours
    }
    
    CACHEOPS = {
        # ... skipped unrelated apps
    
        # Categories
        'categories.websitecategory': {'ops': 'all'},
    
        # Websites
        'websites.website': {'ops': 'all'},
    
        ### THIRD PARTY APPS ###
        'auth.*': {'ops': 'all'},
        'sites.*': {'ops': 'all'},
        'authtoken.*': {'ops': 'all'},
        'notification.*': {'ops': 'all'},
        'oauth2.*': {'ops': 'all'},
    }
    ########## END CACHEOPS CONFIGURATION
    

    Here is how the problem appears to the end user: the user fills in the website creation form by providing a url and description and selecting a few categories, then hits the "SAVE" button. The website is now in the cache and in the db. The user then tries to update the created object and it fails: everything can be updated except the categories m2m field.

    So I tried a couple of experiments.

    I ran ./manage.py shell_plus to get a console, loaded a Website object there and changed its categories. After refreshing the page on the website, the categories were updated. The next part is strange: I then immediately tried to update the categories through the website. The categories field was updated at the db level, but reads through the web interface still return the old m2m set (the console returns fresh results). Subsequent attempts to update the object through the web interface do not hit the db any more. At this point I tried invalidate_obj and invalidate_model from the console, but the web interface still shows the old set of categories. I believe this has nothing to do with browser cache or with the UI.

    I'm mostly positive it has something to do with threading support and has nothing to do with the UI (in my case it's a DRF API, so I tried updating the object using the same serializer inside a console and it works fine, the same as when I update the object directly without the serializer layer).

    Once I set CACHEOPS_FAKE to True, all problems with m2m went away. So it is definitely something related to cacheops.

    @Suor, can you please give me any directions on how I might find the source of this problem? Thank you very much for your work!

    opened by pySilver 16
  • 3rd party apps breaking

    3rd party apps breaking

    I've been using an app called feedjack (http://www.feedjack.org/) to grab feeds:

    Traceback (most recent call last):
      File "feedjack_update.py", line 505, in <module>
        main()
      File "feedjack_update.py", line 488, in main
        for feed in models.Feed.objects.filter(is_active=True):
      File "/usr/local/lib/python2.6/dist-packages/django/db/models/query.py", line 107, in _result_iter
        self._fill_cache()
      File "/usr/local/lib/python2.6/dist-packages/django/db/models/query.py", line 772, in _fill_cache
        self._result_cache.append(self._iter.next())
      File "/usr/local/lib/python2.6/dist-packages/cacheops/query.py", line 301, in iterator
        cache_this = self._cacheprofile is not None and 'fetch' in self._cacheops
    AttributeError: 'QuerySet' object has no attribute '_cacheprofile'
    

    I'm not caching any of the feedjack models:

    CACHEOPS = {
        'pad.padportfoliothumb': ('all', 60*60),
        'pad.*':  ('just_enable', 60*60),
    }
    
    waiting for feedback 
    opened by kevinpostal 15
  • Error when trying to cache an abstract model

    Error when trying to cache an abstract model

    Hello!

    Found an error when using a bunch of libraries:

    • django_pg_bulk_update (requires django_pg_returning which contains the abstract class django_pg_returning.models.UpdateReturningModel)
    • django-cacheops

    When performing a bulk write (django_pg_bulk_update.bulk_update_or_create), cacheops attempts to cache the UpdateReturningModel model (https://github.com/M1ha-Shvn/django-pg-returning/blob/master/src/django_pg_returning/models.py#L8) and looks up its application in the model_profile function (https://github.com/Suor/django-cacheops/blob/master/cacheops/conf.py#L105). This model does not have an application (its app_label is None), which causes an error:

    Traceback (most recent call last):
       File "/usr/local/lib/python3.9/site-packages/funcy/calc.py", line 59, in wrapper
         return memory[key]
       KeyError: (<class 'django_pg_returning.models.UpdateReturningModel'>,)
       During handling of the above exception, another exception occurred:
       ...
       File "/usr/local/lib/python3.9/site-packages/django_pg_bulk_update/query.py", line 1018, in bulk_update_or_create
         batched_result = batched_operation(batch_func, values,
       File "/usr/local/lib/python3.9/site-packages/django_pg_bulk_update/utils.py", line 175, in batched_operation
         r = handler(*args, **kwargs)
       File "/usr/local/lib/python3.9/site-packages/django_pg_bulk_update/query.py", line 948, in _insert_on_conflict_no_validation
         return _execute_update_query(model, conn, sql, params, ret_fds)
       File "/usr/local/lib/python3.9/site-packages/django_pg_bulk_update/query.py", line 553, in _execute_update_query
         from django_pg_returning import ReturningQuerySet
       File "/usr/local/lib/python3.9/site-packages/django_pg_returning/__init__.py", line 3, in <module>
         from .models import * # noqa F401, F403
       File "/usr/local/lib/python3.9/site-packages/django_pg_returning/models.py", line 8, in <module>
         class UpdateReturningModel(models.Model):
       File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 161, in __new__
         new_class.add_to_class(obj_name, obj)
       File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 326, in add_to_class
         value.contribute_to_class(cls, name)
       File "/usr/local/lib/python3.9/site-packages/cacheops/query.py", line 426, in contribute_to_class
         if cls.__module__ != '__fake__' and family_has_profile(cls):
       File "/usr/local/lib/python3.9/site-packages/funcy/calc.py", line 62, in wrapper
         value = memory[key] = func(*args, **kwargs)
       File "/usr/local/lib/python3.9/site-packages/cacheops/utils.py", line 32, in family_has_profile
         return any(model_profile, model_family(cls))
       File "/usr/local/lib/python3.9/site-packages/funcy/colls.py", line 207, in any
         return _any(xmap(pred, seq))
       File "/usr/local/lib/python3.9/site-packages/cacheops/conf.py", line 105, in model_profile
         app = model._meta.app_label.lower()
    AttributeError: 'NoneType' object has no attribute 'lower'
    

    Package versions:

    django-pg-returning==2.0.0
    django-pg-bulk-update==3.7.0
    django-cacheops==6.0
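
    The AttributeError in the traceback comes from calling .lower() on an app_label that is None. A defensive lookup could look roughly like the sketch below. The helper names safe_app_label and model_profile_sketch, and the profiles dict, are hypothetical and are not cacheops code; SimpleNamespace objects stand in for model classes.

```python
from types import SimpleNamespace


def safe_app_label(model):
    """Return the lowercased app label, or None when the model has none."""
    label = getattr(model._meta, "app_label", None)
    return label.lower() if label else None


def model_profile_sketch(model, profiles):
    """Hypothetical lookup: 'profiles' maps 'app.model' names to cache profiles."""
    app = safe_app_label(model)
    if app is None:
        # A model without an application cannot match any configured profile.
        return None
    return profiles.get("%s.%s" % (app, model._meta.model_name))


# Stand-ins for model classes: one without an app label, one regular model.
no_app_model = SimpleNamespace(
    _meta=SimpleNamespace(app_label=None, model_name="updatereturningmodel")
)
regular_model = SimpleNamespace(
    _meta=SimpleNamespace(app_label="Pad", model_name="padportfoliothumb")
)
profiles = {"pad.padportfoliothumb": {"ops": "all", "timeout": 3600}}

no_app_profile = model_profile_sketch(no_app_model, profiles)
regular_profile = model_profile_sketch(regular_model, profiles)
```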
    
    opened by coldcloudgold 2
  • Limit caching to certain DB to support Postgres followers

    Limit caching to certain DB to support Postgres followers

    Problem

    We have a Django setup with a Postgres master and a follower that we use for read-only operations, something like this:

    READ_ONLY_DB = 'follower'
    DATABASES = {
        'default': dj_database_url.config(default='postgres://postgres:abcdEF123456@...:5432/master'),
        'follower': dj_database_url.config(default='postgres://postgres:abcdEF123456@...:5432/follower')
    }
    

    To query the master we just do this:

    MyModel.objects.get(pk=my_model_id)
    

    But whenever we want to query the follower we need to make sure that cacheops isn't caching anything.

    MyModel.objects.using(settings.READ_ONLY_DB).nocache().get(pk=my_model_id)
    

    If we forget nocache(), cacheops caches the entry, and if, say, half an hour later we do:

    my_model = MyModel.objects.get(pk=my_model_id)
    my_model.name = 'somethingelse'
    my_model.save()
    

    it will pull the entry from the cache (which still "knows" that it came from the follower) and then the save() method will throw this exception:

    Cannot assign "<MyModel: ...>": the current database router prevents this relation.
    

    Obviously the follower doesn't allow write operations... This wouldn't be too problematic, as one just needs to make sure not to forget nocache() after using(...). The real fun starts when you chain querysets. Chained querysets are copies of the original queryset and seem to "lose" the nocache() info, while the using(...) info is preserved. So after every filter(...) or values(...) I need to call nocache(), especially when these filter or values options query related objects. I am not even sure where it always needs to be added, but we had a lot of bugs with this approach, and finding them is HARD because the error is thrown somewhere else in the code, so one always has to check where caching was added before, or whether a new feature might use it. :(

    Solution:

    Would it be possible to define a whitelist or blacklist, so that cacheops doesn't cache anything from our 'follower' database? Then we could drop all these nocache() calls and our code would become much simpler and less error prone.
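
    To make the request concrete, here is a rough sketch of what such a blacklist could look like. Everything in it is hypothetical: CACHE_SKIP_DATABASES is not an existing cacheops setting, and FakeQuerySet only imitates how a queryset clone keeps its database alias.

```python
# Hypothetical setting: database aliases whose queries should never be cached.
CACHE_SKIP_DATABASES = {"follower"}


def should_cache(db_alias, skip=CACHE_SKIP_DATABASES):
    """Decide caching from the database alias alone."""
    return db_alias not in skip


class FakeQuerySet:
    """Minimal stand-in for a queryset; only tracks the database alias."""

    def __init__(self, db="default"):
        self.db = db

    def using(self, alias):
        return FakeQuerySet(alias)

    def filter(self, **kwargs):
        # A chained queryset is a copy, but the alias carries over to the copy.
        return FakeQuerySet(self.db)

    def is_cacheable(self):
        return should_cache(self.db)


master_qs = FakeQuerySet().filter(pk=1)
follower_qs = FakeQuerySet().using("follower").filter(pk=1).filter(name="x")
```

    Because the decision depends only on the alias, which survives chaining, no per-call nocache() would be needed in this model.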

    Thank you very much for consideration!

    opened by chickahoona 1
  • Remove not needed six dependency

    Remove not needed six dependency

    The six package is not required anymore, python2 support was removed in https://github.com/Suor/django-cacheops/commit/19c36b2bb51aa05188ee8f6ad23cac1f850d7c66

    opened by danigm 0
  • cache response of DRF view?

    cache response of DRF view?

    Hi, I'm wondering if there's any convenient way to cache the response of a Django REST framework endpoint. From what I've seen, there's no easy way.

    I did this to use cacheops to cache the response, invalidating it if the model Project was created/changed/deleted:

    from cacheops import cached_as
    from rest_framework.response import Response

    def list(self, request, *args, **kwargs):
        queryset = self.filter_queryset(self.get_queryset())
        page = self.paginate_queryset(queryset)

        # Cache the serialized data; invalidated when a Project is
        # created, changed or deleted.
        @cached_as(
            Project,
            extra=lambda: (
                request.user.user_type,
                request.user.country,
                request.user.currency_token,
                request.user.company.is_intracommunity_operator if request.user.company else False,
            ),
        )
        def _list():
            if page is not None:
                serializer = self.get_serializer(page, many=True)
                return serializer.data
            serializer = self.get_serializer(queryset, many=True)
            return serializer.data

        data = _list()
        if page is not None:
            return self.get_paginated_response(data)
        else:
            return Response(data)
    

    I needed a fast solution, so this works for me for now. But I can't keep doing this all over the project if I need to cache more endpoints.

    opened by palvarezcordoba 0
  • Question: Support for read-only nodes

    Question: Support for read-only nodes

    I have a redis server with 2 nodes (1 main and 1 read-only worker) and I'd like to better understand how I can configure cacheops to make use of this worker read-only node. Going through the docs I wasn't able to find anything explicit about it, so could anyone help me on how to configure the lib to use this infrastructure, please?

    opened by studiojms 3
Owner
Alexander Schepanovski