❄️ A flake8 plugin to help you write better list/set/dict comprehensions.

Adam Johnson

Last update: Dec 23, 2022

Overview

flake8-comprehensions

A flake8 plugin that helps you write better list/set/dict comprehensions.

Requirements

Python 3.6 to 3.9 supported.

Installation

First, install with pip:

python -m pip install flake8-comprehensions

Second, check that flake8 lists the plugin in its version line:

$ flake8 --version
3.7.8 (flake8-comprehensions: 3.0.0, mccabe: 0.6.1, pycodestyle: 2.5.0, pyflakes: 2.1.1) CPython 3.8.0 on Linux

Third, add the C4 prefix to your select list. For example, if you have your configuration in setup.cfg:

[flake8]
select = E,F,W,C4

Linting a Django project? Check out my book Speed Up Your Django Tests which covers loads of best practices so you can write faster, more accurate tests.

Rules

C400-402: Unnecessary generator - rewrite as a `<list/set/dict>` comprehension.

It's unnecessary to use list, set, or dict around a generator expression, since there are equivalent comprehensions for these types. For example:

Rewrite list(f(x) for x in foo) as [f(x) for x in foo]
Rewrite set(f(x) for x in foo) as {f(x) for x in foo}
Rewrite dict((x, f(x)) for x in foo) as {x: f(x) for x in foo}
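
As a quick check of the rewrites above, the following sketch compares each flagged call with its comprehension form. The function f and the list foo are placeholders chosen for illustration; they are not part of the plugin.

```python
def f(x):
    # Hypothetical transformation, chosen only for illustration.
    return x * 2


foo = [1, 2, 3]

# Flagged forms (C400, C401, C402): a constructor wrapped around a generator.
list_call = list(f(x) for x in foo)
set_call = set(f(x) for x in foo)
dict_call = dict((x, f(x)) for x in foo)

# Suggested forms: the equivalent comprehensions.
list_comp = [f(x) for x in foo]
set_comp = {f(x) for x in foo}
dict_comp = {x: f(x) for x in foo}
```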

C403-404: Unnecessary list comprehension - rewrite as a `<set/dict>` comprehension.

It's unnecessary to use a list comprehension inside a call to set or dict, since there are equivalent comprehensions for these types. For example:

Rewrite set([f(x) for x in foo]) as {f(x) for x in foo}
Rewrite dict([(x, f(x)) for x in foo]) as {x: f(x) for x in foo}

C405-406: Unnecessary `<list/tuple>` literal - rewrite as a `<set/dict>` literal.

It's unnecessary to use a list or tuple literal within a call to set or dict. For example:

Rewrite set([1, 2]) as {1, 2}
Rewrite set((1, 2)) as {1, 2}
Rewrite set([]) as set()
Rewrite dict([(1, 2)]) as {1: 2}
Rewrite dict(((1, 2),)) as {1: 2}
Rewrite dict([]) as {}

C407: Unnecessary `<dict/list>` comprehension - `<builtin>` can take a generator

It's unnecessary to pass a list comprehension to some builtins that can take generators instead. For example:

Rewrite sum([x ** 2 for x in range(10)]) as sum(x ** 2 for x in range(10))
Rewrite all([foo.bar for foo in foos]) as all(foo.bar for foo in foos)
Rewrite filter(lambda x: x % 2 == 0, [x ** 3 for x in range(10)]) as filter(lambda x: x % 2 == 0, (x ** 3 for x in range(10)))

The builtins that are checked are:

all
any
enumerate
filter
frozenset
map
max
min
sorted
sum
tuple
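
A minimal sketch of the C407 rewrite, using a few of the builtins listed above. The results are the same here because the expressions have no side effects; the variable names are illustrative only.

```python
# Flagged by C407: the whole list is built before sum() sees it.
total_from_list = sum([x ** 2 for x in range(10)])

# Suggested form: sum() consumes the generator one item at a time.
total_from_gen = sum(x ** 2 for x in range(10))

# sorted() and tuple() accept a generator in the same way.
ordered = sorted(x % 3 for x in range(6))
as_tuple = tuple(x * 2 for x in range(3))
```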

C408: Unnecessary `<dict/list/tuple>` call - rewrite as a literal.

Calling dict() is slower than using the empty literal {}, because the name dict must be looked up in the global scope in case it has been rebound. The same applies to the other two basic types here. For example:

Rewrite dict() as {}
Rewrite dict(a=1, b=2) as {"a": 1, "b": 2}
Rewrite list() as []
Rewrite tuple() as ()
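
One way to see the name lookup described above is to inspect the compiled code objects. This is only a sketch: it relies on the standard co_names attribute, which lists the names a code object has to resolve at run time, and it does not measure speed (timeit could be used for that).

```python
# Compile both spellings as standalone expressions.
literal_code = compile("{}", "<example>", "eval")
call_code = compile("dict()", "<example>", "eval")

# The literal needs no name lookups; the call must resolve "dict" first.
literal_names = literal_code.co_names
call_names = call_code.co_names

# Both spellings build equal dictionaries.
same_result = dict(a=1, b=2) == {"a": 1, "b": 2}
```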

C409-410: Unnecessary `<list/tuple>` passed to `<list/tuple>`() - (remove the outer call to `<list/tuple>`()/rewrite as a `<list/tuple>` literal).

It's unnecessary to use a list or tuple literal within a call to list or tuple, since there is literal syntax for these types. For example:

Rewrite tuple([1, 2]) as (1, 2)
Rewrite tuple((1, 2)) as (1, 2)
Rewrite tuple([]) as ()
Rewrite list([1, 2]) as [1, 2]
Rewrite list((1, 2)) as [1, 2]
Rewrite list([]) as []

C411: Unnecessary list call - remove the outer call to list().

It's unnecessary to use a list around a list comprehension, since it is equivalent without it. For example:

Rewrite list([f(x) for x in foo]) as [f(x) for x in foo]

C412: Unnecessary `<dict/list/set>` comprehension - 'in' can take a generator.

It's unnecessary to pass a dict/list/set comprehension to 'in', as it can take a generator instead. For example:

Rewrite y in [f(x) for x in foo] as y in (f(x) for x in foo)
Rewrite y in {x ** 2 for x in foo} as y in (x ** 2 for x in foo)

C413: Unnecessary `<list/reversed>` call around sorted().

It's unnecessary to use list() around sorted() as it already returns a list. It is also unnecessary to use reversed() around sorted() as the latter has a reverse argument. For example:

Rewrite list(sorted([2, 3, 1])) as sorted([2, 3, 1])
Rewrite reversed(sorted([2, 3, 1])) as sorted([2, 3, 1], reverse=True)
Rewrite reversed(sorted([2, 3, 1], reverse=True)) as sorted([2, 3, 1])
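
The sketch below checks the first two rewrites on the sample list. It also shows a caveat that is not stated above and is added here as an observation about Python's stable sort: for items that compare equal, reverse=True keeps their original relative order, while reversed() flips it. Note too that reversed() returns an iterator, whereas sorted() returns a list.

```python
from operator import itemgetter

data = [2, 3, 1]

wrapped = list(sorted(data))                 # flagged: list() is redundant
plain = sorted(data)

via_reversed = list(reversed(sorted(data)))  # flagged
descending = sorted(data, reverse=True)      # suggested

# Caveat with ties: both pairs have the same sort key (1).
pairs = [("a", 1), ("b", 1)]
ties_via_reversed = list(reversed(sorted(pairs, key=itemgetter(1))))
ties_via_flag = sorted(pairs, key=itemgetter(1), reverse=True)
```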

C414: Unnecessary `<list/reversed/set/sorted/tuple>` call within `<list/set/sorted/tuple>`().

It's unnecessary to double-cast or double-process iterables by wrapping the listed functions within list/set/sorted/tuple. For example:

Rewrite list(list(iterable)) as list(iterable)
Rewrite list(tuple(iterable)) as list(iterable)
Rewrite tuple(list(iterable)) as tuple(iterable)
Rewrite tuple(tuple(iterable)) as tuple(iterable)
Rewrite set(set(iterable)) as set(iterable)
Rewrite set(list(iterable)) as set(iterable)
Rewrite set(tuple(iterable)) as set(iterable)
Rewrite set(sorted(iterable)) as set(iterable)
Rewrite set(reversed(iterable)) as set(iterable)
Rewrite sorted(list(iterable)) as sorted(iterable)
Rewrite sorted(tuple(iterable)) as sorted(iterable)
Rewrite sorted(sorted(iterable)) as sorted(iterable)
Rewrite sorted(reversed(iterable)) as sorted(iterable)

C415: Unnecessary subscript reversal of iterable within `<reversed/set/sorted>`().

It's unnecessary to reverse the order of an iterable when passing it into one of the listed functions will change the order again. For example:

Rewrite set(iterable[::-1]) as set(iterable)
Rewrite sorted(iterable[::-1]) as sorted(iterable, reverse=True)
Rewrite reversed(iterable[::-1]) as iterable

C416: Unnecessary `<list/set>` comprehension - rewrite using `<list/set>`().

It's unnecessary to use a list or set comprehension if the elements are unchanged. The iterable should be wrapped in list() or set() instead. For example:

Rewrite [x for x in iterable] as list(iterable)
Rewrite {x for x in iterable} as set(iterable)
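
A short sketch of the C416 rewrite, using a string as a stand-in iterable. The last line is a comprehension with a filter, included as a contrast: it changes which elements are kept, so the rewrite does not apply to it.

```python
iterable = "hello"  # any iterable works; a string is used for illustration

list_comp = [x for x in iterable]   # flagged
list_call = list(iterable)          # suggested

set_comp = {x for x in iterable}    # flagged
set_call = set(iterable)            # suggested

# Not an identity comprehension: the filter drops elements.
filtered = [x for x in iterable if x != "l"]
```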

Comments

Incorrect C408 rule warning for dict with keyword arguments

The C408 rule seems to fail on a dict with keyword arguments, but from the description of the rule, it should not fail.

$ cat scratch.py 
a = dict(one=1, two=2, three=3)

$ flake8 --version
3.8.2 (flake8-bugbear: 20.1.4, flake8-comprehensions: 3.2.2, mccabe: 0.6.1, pycodestyle: 2.6.0, pyflakes: 2.2.0) CPython 3.8.2 on Darwin

$ flake8 ~/Library/Preferences/PyCharmCE2019.3/scratches/scratch.py 
scratch.py:1:5: C408 Unnecessary dict call - rewrite as a literal.

opened by deveshks 13

"Unnecessary dict call" false positives

The dict function has 3 forms: dict(**kwarg), dict(mapping, **kwarg), and dict(iterable, **kwarg). I'm not sure if it makes sense to apply this rule to every form.

Here's one example use case:

def foo(data):
    defaults = {'a': 1, 'b': 2}
    return dict(defaults, **data)

This example could be rewritten to use literals, but it would be less succinct and (I assume, but haven't confirmed) less performant.
opened by faulkner 10

test_C416_fail_1_list fails

When building package for openSUSE, the test suite fails in this one test test_C416_fail_1_list:

[   18s] =================================== FAILURES ===================================
[   18s] ____________________________ test_C416_fail_1_list _____________________________
[   18s]
[   18s] flake8dir = <pytest_flake8dir.Flake8Dir object at 0x7fa392a57160>
[   18s]
[   18s]     def test_C416_fail_1_list(flake8dir):
[   18s]         flake8dir.make_example_py("[x for x in range(5)]")
[   18s]         result = flake8dir.run_flake8()
[   18s]         # Column offset for list comprehensions was incorrect in Python < 3.8.
[   18s]         # See https://bugs.python.org/issue31241 for details.
[   18s]         col_offset = 1 if sys.version_info >= (3, 8) else 2
[   18s] >       assert result.out_lines == [
[   18s]             "./example.py:1:%d: C416 Unnecessary list comprehension - rewrite using list()."
[   18s]             % col_offset,
[   18s]         ]
[   18s] E       AssertionError: assert ['./example.p...sing list().'] == ['./example.p...sing list().']
[   18s] E         At index 0 diff: './example.py:1:1: C416 Unnecessary list comprehension - rewrite using list().' != './example.py:1:2: C416 Unnecessary list comprehension - rewrite using list().'
[   18s] E         Full diff:
[   18s] E           [
[   18s] E         -  './example.py:1:2: C416 Unnecessary list comprehension - rewrite using '
[   18s] E         ?                  ^
[   18s] E         +  './example.py:1:1: C416 Unnecessary list comprehension - rewrite using '
[   18s] E         ?                  ^...
[   18s] E
[   18s] E         ...Full output truncated (3 lines hidden), use '-vv' to show
[   18s]
[   18s] tests/test_flake8_comprehensions.py:840: AssertionError
[   18s] =========================== short test summary info ============================
[   18s] FAILED tests/test_flake8_comprehensions.py::test_C416_fail_1_list - Assertion...
[   18s] ======================== 1 failed, 63 passed in 13.74s =========================

Build log with all details.

opened by mcepl 9

Performance implications of C412

Hi,

We have some code that looks like:

def is_valid(vm: str) -> bool:
    return vm in {machine for machine in machines}

And we got this lint error:

C412: Unnecessary (dict/list/set) comprehension - ‘in’ can take a generator.

This code is not particularly performance sensitive, so using a set or a generator probably does not make a big difference (and in fact the generator is probably faster!), but I am concerned that in other cases, applying the lint suggestion might result in an O(n) search rather than an amortised constant-time lookup.

In the original PR (#166) @adamchainz mentioned that checking for inclusion in some generators, such as range, is optimised and that converting to a list / set and checking for inclusion incurs more memory copies and a slower time overall, but as far as I know this is not true in other cases such as the one we bumped into.

Maybe the lint message could be updated so folks are aware that it might have a performance penalty?

Cheers,
opened by javierhonduco 9
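
The trade-off raised in this issue can be sketched as follows. The machines list and the counting helper are hypothetical, introduced only to make the number of consumed elements visible. The point is that a generator gives a linear scan per lookup, while a set that is built once and reused gives hash lookups afterwards. A set comprehension rebuilt inside every call, as in is_valid above, pays the linear cost each time anyway.

```python
machines = ["vm-%d" % i for i in range(1000)]  # hypothetical data


def counting(items, counter):
    """Yield items while recording how many were consumed."""
    for item in items:
        counter[0] += 1
        yield item


# A generator stops at the first match, so an early hit is cheap...
early = [0]
found_early = "vm-0" in (m for m in counting(machines, early))

# ...but a miss walks every element: O(n) for that lookup.
miss = [0]
found_miss = "vm-missing" in (m for m in counting(machines, miss))

# Building the set once costs O(n) up front; later lookups are hash probes.
machine_set = set(machines)
found_in_set = "vm-999" in machine_set
```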

Add the `iter()` builtin to checker `C407`

`iter()`

Example:

even_iterator = iter([i for i in range(1,100) if i%2 == 0])

This can be directly written as:

even_iterator = iter(i for i in range(1,100) if i%2 == 0)

opened by sauravsrijan 9

Convert `map(lambda x: expression, iterable)` to `(expression for x in iterable)`

Description

map(f, iterable) has great performance when f is a built-in function, and it makes sense if your function already has a name. But if you need to introduce a lambda, it's better (i.e., more readable and faster) to use a generator expression -- no function calls needed.

Good:

for y in map(my_function_that_already_exists, it):
    ...
ys = list(map(my_function_that_already_exists, it))
ys = set(map(my_function_that_already_exists, it))
etc.

Suggestions:

# Bad:
map(lambda x: x**2, iterable)
# change to:
(x**2 for x in iterable)

# Bad:
list(map(lambda x: x.attr, iterable))
# change to:
[x.attr for x in iterable]
# OR change to:
y = list(map(operator.attrgetter("attr"), iterable))

# Bad:
z = map(lambda x: f(x), iterable)
# change to:
z = map(f, iterable)

Here are some perf numbers (deque(..., maxlen=0) is a fast way to consume an iterable):

PS C:\Users\sween> py -3.8 -m pyperf timeit -s "from collections import deque" "deque(map(lambda x: -x, range(1_000_000)), maxlen=0)"
Mean +- std dev: 85.7 ms +- 1.7 ms

PS C:\Users\sween> py -3.8 -m pyperf timeit -s "from collections import deque" "deque((-x for x in range(1_000_000)), maxlen=0)"

Mean +- std dev: 63.1 ms +- 2.9 ms

Similar things hold for filter().

opened by sweeneyde 7
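
The three spellings suggested in this issue can be compared directly. In this sketch the objects come from types.SimpleNamespace purely for illustration, and the import of operator that the attrgetter variant needs is made explicit. It checks equivalence only and makes no timing claims.

```python
import operator
from types import SimpleNamespace

# Hypothetical objects with an "attr" attribute.
iterable = [SimpleNamespace(attr=n) for n in range(5)]

with_lambda = list(map(lambda x: x.attr, iterable))           # "Bad" form
with_comprehension = [x.attr for x in iterable]               # suggested
with_attrgetter = list(map(operator.attrgetter("attr"), iterable))  # alternative
```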

Add __slots__ to plugin class to reduce memory footprint

Hi!

First of all, thanks for creating this plugin! I've been using it for years now and it's great 👏

This PR is more of a question than a clear suggested improvement; I thought about raising an issue to discuss it, but a PR seemed more actionable in case you agree with the rationale.

Changes

So the PR simply defines __slots__ on the plugin class, to pre-declare the class attributes we expect and in turn, make it a little bit more performant.

I did some memory profiling using memory_profiler and from what I can tell, this reduces the memory footprint of the plugin class by 15-20%. I also tested using sys.getsizeof and found basically the same results there.

The drawback of declaring __slots__ is, of course, that it prevents you from setting new instance variables at runtime, but as far as I can tell (and I could be wrong), flake8 never does this. I don't know of any other flake8 plugins that declare it, so there could be a reason why this is a bad idea, but it is not clear to me, and the test suite seems to have passed 🎉

Curious to see what your take on this is 😌

Thanks again 👏

opened by sondrelg 7
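
The behaviour described in this PR can be sketched with two small classes. WithDict and WithSlots are hypothetical names and do not reflect the plugin's actual class; the sketch only shows that a slotted instance has no per-instance __dict__ and rejects undeclared attributes.

```python
class WithDict:
    def __init__(self, tree):
        self.tree = tree


class WithSlots:
    __slots__ = ("tree",)  # pre-declare the only expected attribute

    def __init__(self, tree):
        self.tree = tree


plain = WithDict(tree=None)
slotted = WithSlots(tree=None)

# Regular instances carry a per-instance __dict__; slotted ones do not.
plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")

# The drawback: undeclared attributes cannot be added to a slotted instance.
plain.extra = 1
try:
    slotted.extra = 1
    new_attr_allowed = True
except AttributeError:
    new_attr_allowed = False
```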

Python 2.6 mode

It would be nice if a project could specify that it needs '2.6' support, so that only rules that don't break 2.6 support would be active.

For example, the set rules in #2 rely on Python 2.7 syntax. As a result, we need to turn off those codes, despite those codes having many good rules that are applicable and valid on Python 2.6, because a few rules are not.

https://gerrit.wikimedia.org/r/#/c/302103/

It would be especially lovely if you supported min-version = 2.6 that flake8-future-import uses, which also allows for other versions to have specific syntax that affects these rules.

opened by jayvdb 7

Unnecessary list literal that should be a tuple

This isn't strictly a case for comprehensions, but it seems to fit nicely with your other errors. sorted([1, 3, 2]) uses an unnecessary list that should be replaced with a tuple, as the iterable processor sorted (and any other read-only processor) doesn't need a list.

i.e. this should be

sorted((1, 3, 2))

opened by jayvdb 6

False type for tree

Python Version

3.7.1

flake8 Version

3.9.2

Package Version

latest

Description

Hi, thanks for adding type hints to the project. The type for ComprehensionChecker.__init__.tree is ast.Module, but should be ast.AST.

I'm happy to open a PR if you like

opened by kasium 5

C407 increases laziness which should be noted since it can change behaviour

The following issues the C407 warning (as desired)

results = all([m.delete() for m in q.where_in(self._other_key, ids).get()])

however if you appease it and switch the code to

results = all((m.delete() for m in q.where_in(self._other_key, ids).get()))

or

results = all(m.delete() for m in q.where_in(self._other_key, ids).get())

It causes a subtle bug due to the new laziness gained by the generator expression

This can be evidenced here

>>> all((print(i) for i in range(1, 10)))
1
False
>>> all((print(i) or True for i in range(1, 10)))
1
2
3
4
5
6
7
8
9
True

This warning should probably remind/warn (haha) the user that there is a risk in changing the code to follow the warning if the generator has a side-effecting function in it
opened by merc1031 5
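
A self-contained sketch of the behaviour change reported in this issue. Model is a hypothetical stand-in for the ORM objects in the report: delete() records that it ran and returns whether it succeeded. With a list, every delete() runs before all() looks at the results; with a generator, all() stops at the first falsy result and the remaining deletes never happen.

```python
class Model:
    """Hypothetical stand-in for an ORM object whose delete() can fail."""

    def __init__(self, ok):
        self.ok = ok
        self.deleted = False

    def delete(self):
        self.deleted = True
        return self.ok


def make_models():
    # The second delete() reports failure.
    return [Model(True), Model(False), Model(True)]


eager = make_models()
eager_result = all([m.delete() for m in eager])  # list: every delete() runs
eager_deleted = [m.deleted for m in eager]

lazy = make_models()
lazy_result = all(m.delete() for m in lazy)      # generator: stops early
lazy_deleted = [m.deleted for m in lazy]
```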

Feature request: Inverse of C417: check for `f(x) for x in iterable`

Description

(Thanks for flake8-comprehensions, it's a good plugin) Check for f(x) for x in iterable in any comprehension, and suggest replacing it with map(f, iterable). This is not objectively better like some other rules are, but IMO this is cleaner.

opened by GideonBear 3

any/all short circuiting

Description

A list comprehension postpones the evaluation of any/all until the whole list has been built, which can hurt performance for larger iterables. Surprisingly, this is not yet included here as a rule.

Example:

def hi():
    print('hi')
    return True

>>> any(hi() for num in [1, 2, 3, 4])
hi
>>> any([hi() for num in [1, 2, 3, 4]])
hi
hi
hi
hi

From this answer

Proposal

Extend the current set of rules with C418 and C419 (any, all), similar to C403 and C404
opened by sbrugman 2

Extend C41x to detect for _ in list(dict.values())

Hello, I inherited code with a lot of patterns like this:

for val in list(some_dict.values()):
    do_something(val)

As well as the list comprehension counterpart. As this plugin helped me catch a lot of other list comprehension and list/set/dict abuse, I wondered if such a pattern could be added to C411 (or another code).
opened by Lattay 2
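
One point worth weighing for a rule like this, sketched below under the assumption that the loop body may mutate the dictionary: the list() copy is redundant for read-only loops, but it is what makes deletion during iteration safe. The dictionary contents are illustrative only.

```python
some_dict = {"a": 1, "b": 2, "c": 3}

# Read-only loop: iterating the view directly is enough.
seen = [val for val in some_dict.values()]

# Mutating while iterating the live view raises RuntimeError.
live = some_dict.copy()
try:
    for val in live.values():
        if val == 1:
            del live["a"]
    live_error = None
except RuntimeError as exc:
    live_error = exc

# Iterating a list() snapshot makes the same mutation safe.
snapshot = some_dict.copy()
for val in list(snapshot.values()):
    if val == 1:
        del snapshot["a"]
```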

Add zip to C407

zip https://docs.python.org/3/library/functions.html?highlight=zip#zip

How come I forgot to mention this one in the last issue I don't know, since I use it quite often :)

opened by joaoe 0