This is an implementation of PEP 557, Data Classes.

Overview

This is an implementation of PEP 557, Data Classes. It is a backport for Python 3.6. Because dataclasses will be included in Python 3.7, any discussion of dataclass features should occur on the python-dev mailing list at https://mail.python.org/mailman/listinfo/python-dev. At this point this repo should only be used for historical purposes (it's where the original dataclasses discussions took place) and for discussion of the actual backport to Python 3.6.

See https://www.python.org/dev/peps/pep-0557/ for the details of how Data Classes work.

A test file can be found at https://github.com/ericvsmith/dataclasses/blob/master/test/test_dataclasses.py, or in the sdist file.

Installation

pip install dataclasses

Example Usage

from dataclasses import dataclass

@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

item = InventoryItem('hammers', 10.49, 12)
print(item.total_cost())

Some additional tools can be found in dataclass_tools.py, included in the sdist.

Compatibility

This backport assumes that dict objects retain their insertion order. This is true in the language spec for Python 3.7 and greater. Since this is a backport to Python 3.6, it raises an interesting question: does that guarantee apply to 3.6? For CPython 3.6 it does. As of the time of this writing, it's also true for all other Python implementations that claim to be 3.6 compatible, of which there are none. Any new 3.6 implementations are expected to have ordered dicts. See the analysis at the end of this email:

https://mail.python.org/pipermail/python-dev/2017-December/151325.html
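
To see why ordering matters in practice, note that the generated __init__ takes its positional parameters in field-definition order. The following is a minimal sketch, reusing the InventoryItem class from the usage example (redefined here so the snippet stands alone); it only illustrates the observable behaviour, not how the module is implemented internally.

```python
from dataclasses import dataclass, fields

@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0

# Field order follows the order of the annotations in the class body.
field_names = [f.name for f in fields(InventoryItem)]
print(field_names)

# Positional arguments to the generated __init__ map onto that same order.
item = InventoryItem('hammers', 10.49, 12)
print(item)
```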

As of version 0.4, this code no longer works with Python 3.7. For 3.7, use the built-in dataclasses module.

Release History

Version  Date        Description
0.8      2020-11-13  Fix ClassVar in .replace()
0.7      2019-10-20  Require Python 3.6 only
0.6      2018-05-17  Equivalent to Python 3.7.0rc1
0.5      2018-03-28  Equivalent to Python 3.7.0b3
Comments
  • Copying mutable defaults

    Guido and I discussed this yesterday, and we decided we'd just copy.copy() the default values when creating a new instance.

    The __init__ code I'm currently generating for:

    @dataclass
    class C:
        x: list = []
    

    Looks something like:

    def __init__(self, x=[]):
        self.x = copy.copy(x)
    

    But I don't think this is what we really want. I don't think we want to call copy.copy() if passed in an unrelated list, like C(x=mylist). Maybe __init__ should check and only call copy.copy() if x is C.x?

    So:

    def __init__(self, x=C.x):
        self.x = copy.copy(x) if x is C.x else x
    

    ?

    (I haven't checked that this actually works as written, but the idea is to only copy the argument if it's the same object as the default that was assigned in the class creation statement.)
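
The "copy only when the default is used" idea can be sketched by hand. This is an illustration under stated assumptions, not the generated code: the module-level name _DEFAULT_X is hypothetical and stands in for C.x, which cannot be referenced from inside the class body while C is still being defined.

```python
import copy

# Hypothetical stand-in for the default assigned in the class body.
_DEFAULT_X = []

class C:
    x = _DEFAULT_X

    def __init__(self, x=_DEFAULT_X):
        # Copy only when the caller fell back to the shared default;
        # a list passed in explicitly is stored as-is.
        self.x = copy.copy(x) if x is _DEFAULT_X else x

a = C()
b = C()
a.x.append(1)       # must not leak into b or into the class default

mylist = [10]
c = C(x=mylist)     # the caller's list is kept, not copied
```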

    opened by ericvsmith 47
  • Improve asdict and astuple

    This PR adds two keyword arguments copy_fields and nested to asdict() and astuple(). I think the first one should be on by default, so that simple mutable fields (like lists) will be copied by asdict() and astuple().

    As well, as discussed in https://github.com/ericvsmith/dataclasses/pull/26, this adds dict_factory keyword argument, so that a user can use OrderedDict or other custom mapping.
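
For orientation, the dataclasses module that shipped in Python 3.7 accepts dict_factory on asdict() and copies nested lists recursively. The sketch below shows only that released behaviour; the copy_fields and nested keywords proposed in this PR are not used here, and the Point class is purely illustrative.

```python
from collections import OrderedDict
from dataclasses import dataclass, field, asdict, astuple

@dataclass
class Point:
    x: int
    y: int
    tags: list = field(default_factory=list)

p = Point(1, 2, ['a'])

# dict_factory lets the caller choose the mapping type of the result.
d = asdict(p, dict_factory=OrderedDict)

# Lists held in fields are copied, so mutating the result leaves p untouched.
d['tags'].append('b')

t = astuple(p)
```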

    opened by ilevkivskyi 27
  • Should it be possible to pass parameter(s) to the post-init function?

    Currently the post-init function __dataclass_post_init__ takes no parameters. Should it be possible to pass one or more parameters from __init__ to the post-init function? Is it possible to make the parameter optional?

    It would be nice to pass parameters to __init__ which do not initialize fields, but are accessible to the post-init function. But, it might mess up the interface too much, and there are issues with calling __dataclass_post_init__'s super() function if we decide to change the number of parameters.

    This issue is a placeholder to decide this issue before the code and PEP are finalized. Any suggestions or thoughts are welcome.
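
For reference, the released module names the hook __post_init__ and passes it any fields declared as InitVar, which are accepted by __init__ but not stored as fields. A minimal sketch of that mechanism follows; the Rectangle class and its field names are illustrative only.

```python
from dataclasses import dataclass, field, fields, InitVar

@dataclass
class Rectangle:
    width: float
    height: float
    scale: InitVar[float] = 1.0       # passed to __post_init__, not a real field
    area: float = field(init=False)   # computed in __post_init__

    def __post_init__(self, scale):
        self.area = self.width * self.height * scale

r = Rectangle(2.0, 3.0, scale=2.0)
default_r = Rectangle(2.0, 3.0)

# InitVar pseudo-fields are not reported by fields().
stored = [f.name for f in fields(Rectangle)]
```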

    opened by ericvsmith 27
  • On naming

    As I’ve already mentioned by e-mail, I’m strongly opposed to calling this concept “data classes”.

    Having an easy way to define many small classes with attributes has nothing to do with data; it’s about good OO design.

    Calling it “data classes” implies that they differ from…“code classes” I guess?

    One of the things people love about attrs is that it’s helping them to write regular classes which they can add methods to without any subclassing or other magic. IOW: to focus on the actual code they want to write as opposed to generic boilerplate.

    Debasing them by name seems like a poor start to me. We do have data containers in the stdlib (namedtuples, SimpleNamespace) so I don’t see a reason to add a third to the family – even if just by name.

    opened by hynek 27
  • why not just attrs?

    I don't mean this as a dig at the project here, but a serious question:

    Is there any reason this project should exist?

    Every aspect of the design here appears to be converging further in the direction of exact duplication of functionality available within attrs, except those which are features already on attrs's roadmap. The attrs maintainers have carefully considered a number of subtle issues related to this problem domain, and every discussion I have observed thus far on the dataclasses repo has paralleled the reasoning process that went into attrs's original design or its maintenance. (Remembering, also, that attrs is itself a second-generation project and there was a lot of learning that came from characteristic.)

    I haven't had time to read all of the python-ideas thread, but all of the objections to attrs that I could find seem to have to do with whimsical naming or a desire to shoehorn convenient class creation into a "special" role that is just for "data" and not for regular classes somehow. I shouldn't be passive-aggressive here, so I should just say: I can't follow Nick's reasoning on #1 at all :-).

    The silly names could be modified by a trivially tiny fork, if that is really a deal-breaker for stdlib adoption; honestly I find that the names grow on you. (More than one attrs user has separately come up with the idea that it is a lot like Python's significant whitespace.)

    That said, of course there may be some reasons or some broader goal that I'm missing, but if this is the case it seems like writing a very clear "goals / non-goals / rejected approaches" section for the PEP itself would be worthwhile. The reasons given in the PEP don't really make sense; the need to support python 2 hasn't been a drag on python 3 that I'm aware of, and annotation-based attribute definition is coming to attrs itself; it's a relatively small extension.

    opened by glyph 21
  • mypy==0.620 compatibility issue

    Versions

    • Python 3.6.5

    • mypy==0.620+dev.bc0b551d8df2baadc44f0c3b0b801fcc12119658

    • dataclasses==0.6

    What's wrong

    $ cat f2.py 
    from dataclasses import dataclass
    print('works!')
    $ python f2.py 
    works!
    $ mypy f2.py 
    f2.py:1: error: Cannot find module named 'dataclasses'
    f2.py:1: note: (Perhaps setting MYPYPATH or using the "--ignore-missing-imports" flag would help)
    

    So the natural way is to use MYPYPATH to point to the specific package, dataclasses in this case. The problem is that the dataclasses package is installed directly in site-packages (virtual_env_path/lib/python3.6/site-packages/dataclasses.py in my case), and it's an error (and a bad idea) to pass the whole of site-packages to MYPYPATH.

    When I manually move the file from /site-packages/dataclasses.py to /site-packages/dataclasses/dataclasses.py and pass the path to MYPYPATH it's working as expected: no mypy error. Unfortunately, this is not the solution I can use.

    I reported this issue on the mypy issue tracker (https://github.com/python/mypy/issues/5342) and was told that:

    The backport package can fix this by including stubs (presumably the same ones that are in typeshed) and marking itself as typed by using py.typed.

    Can anything be done about it? I cannot migrate to Python 3.7 yet and would benefit much from mypy actually understanding dataclasses.

    opened by pawelswiecki 20
  • differences / compatibility with attrs project

    It would be helpful to have a list of functional differences between dataclasses and attrs, broken down by @dataclass vs @attr.s and field vs attr.ib.

    This would be useful and illuminating for a few reasons:


    1. It would make it easier to vet the logic behind, and need for, each of the proposed differences.

       @hynek and @Tinche have invested years of thought into the current design: deviating from it without fully understanding the history and reasoning behind each decision might lead to this project needlessly repeating mistakes. I'm glad to see that the attrs devs have already been brought into several issues. My hope is we can get a bird's eye view so that nothing slips through the cracks.

    2. If the differences aren't too great (and ideally they will not be, see above) I'd like to see a dataclass compatibility mode for attrs (e.g. from attrs import dataclass, field).

    I'm glad that this badly-needed feature is being worked on, but sadly I'm stuck in python 2 for at least another 2 years, so it's important to me, and surely many attrs-users, to have an easy path to adoption once this becomes part of stdlib.

    opened by chadrik 20
  • How to specify factory functions

    In #3, we discussed mutable defaults which included talk of factory functions. Having resolved #3, this issue is about how to specify default value factory functions.

    Some options:

    1. A new parameter to field(). For now, let's call it factory:

        @dataclass
        class Foo:
            x: list = field(factory=list, repr=False)

       It would be an error to specify both default= and factory=.

    2. Re-use the default parameter to field(), marking it as a factory function so we can determine whether to call it. For now, let's mark it by using Factory(callable).

        @dataclass
        class Foo:
            x: list = field(default=Factory(list), repr=False)

    3. Have a separate flavor of field used with factory functions. For now, let's assume it's called factory_field. It would not have a default= parameter:

        @dataclass
        class Foo:
            x: list = factory_field(list, repr=False)
    

    I don't have a real preference among these. I sense we should go with whatever makes mypy's life easier, but I don't know what that would be. I suspect it would be 3, since I think mypy could be told that the type of factory_field(callable) is the type of callable(). But I'm just hypothesizing, and am interested in the opinion of experts.
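
For comparison, the module as released in Python 3.7 uses a variant of option 1, with the parameter spelled default_factory. The sketch below shows that released spelling rather than any of the provisional names above, including the error raised when both a default and a factory are given.

```python
from dataclasses import dataclass, field

@dataclass
class Foo:
    x: list = field(default_factory=list, repr=False)

a = Foo()
b = Foo()
a.x.append(1)   # each instance gets its own list from the factory

# Supplying both a default and a factory is rejected.
try:
    field(default=[], default_factory=list)
except ValueError as exc:
    both_error = str(exc)
```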

    opened by ericvsmith 19
  • doc string for __init__

    From a discussion with @raymondh:

    Should we auto-generate some sort of doc string for __init__, or allow the user to specify one (maybe as a param to @dataclass)?

    I'm not sure a generated one wouldn't have too much noise to be useful.
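
Independently of what the decorator ends up doing, a user can already attach a docstring after decoration, because function docstrings are writable. This is a workaround sketch using plain attribute assignment, not a proposed @dataclass parameter; the class C is illustrative.

```python
from dataclasses import dataclass

@dataclass
class C:
    """A point-like record."""
    x: int
    y: int = 0

# Function docstrings are writable, so one can be set on the generated __init__.
C.__init__.__doc__ = "Create a C from x and an optional y (default 0)."

c = C(1)
```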

    opened by ericvsmith 18
  • Add module helper function that provides access to all fields.

    Since we decided in issue #8 to use module level helper functions instead of instance methods, I want to add the first such function.

    dataclasses.fields(cls) will return a tuple of Field objects defined in cls. Each Field object represents one field in the class.

    This will be the basic building block for a number of introspection methods.

    attrs returns an object that can be either indexed or accessed by field name. I think that's a good idea, but I'm not going to implement it at first.
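
A short usage sketch of fields() as it exists in the released module, where it returns a plain tuple of Field objects. The class C and the summary list are illustrative.

```python
from dataclasses import dataclass, field, fields

@dataclass
class C:
    x: int
    y: list = field(default_factory=list, repr=False)

all_fields = fields(C)   # also accepts an instance, e.g. fields(C(1))

# Each Field carries the name, the annotated type and the per-field options.
summary = [(f.name, f.type, f.repr) for f in all_fields]
print(summary)
```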

    opened by ericvsmith 16
  • Support __slots__?

    Currently the draft PEP specifies and the code supports the optional ability to add __slots__. This is the one place where @dataclass cannot just modify the given class and return it: because __slots__ must be specified at class creation time, it's too late by the time the dataclass decorator gets control. The current approach is to dynamically generate a new class while setting __slots__ in the new class and copying over other class attributes. The decorator then returns the new class.

    The question is: do we even want to support setting __slots__? Is having __slots__ important enough to have this deviation from the "we just add a few dunder methods to your class" behavior?

    I see three options:

    1. Leave it as-is, with @dataclass(slots=True) returning a new class.
    2. Completely remove support for setting __slots__.
    3. Add a different decorator, say @add_slots, which takes a data class and creates a new class with __slots__ set.

    I think we should either go with 2 or 3. I don't mind not supporting __slots__, but if we do want to support it, I think it's easier to explain with a separate decorator.

    @add_slots
    @dataclass
    class C:
        x: int
        y: int
    

    It would be an error to use @add_slots on a non-dataclass class.
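
The reason a new class is needed can be shown in a few lines. This is a simplified sketch of the "generate a new class" approach described above, not the actual add_slots code: it assumes no field has a default value (class attributes with the same names would conflict with __slots__) and ignores inheritance.

```python
from dataclasses import dataclass

@dataclass
class Plain:
    x: int
    y: int

# Assigning __slots__ after the class exists does not change its layout:
# instances still carry a __dict__.
Plain.__slots__ = ('x', 'y')
plain_has_dict = hasattr(Plain(1, 2), '__dict__')

# Rebuild the class with __slots__ present in the namespace at creation time.
skip = ('__dict__', '__weakref__', '__slots__')
namespace = {k: v for k, v in Plain.__dict__.items() if k not in skip}
namespace['__slots__'] = ('x', 'y')
Slotted = type(Plain.__name__, Plain.__bases__, namespace)

s = Slotted(1, 2)
slotted_has_dict = hasattr(s, '__dict__')
```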

    opened by ericvsmith 15
  • Support kw_only?

    New in version 3.10. kw_only: If true (the default value is False), then all fields will be marked as keyword-only. If a field is marked as keyword-only, then the only effect is that the __init__() parameter generated from a keyword-only field must be specified with a keyword when __init__() is called. There is no effect on any other aspect of dataclasses. See the parameter glossary entry for details. Also see the KW_ONLY section.

    Source: https://docs.python.org/3/library/dataclasses.html

    It solves the limitation that a base class can't hold default values.

    Example: https://stackoverflow.com/questions/51575931/class-inheritance-in-python-3-7-dataclasses
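
The limitation and the way kw_only sidesteps it can be sketched as follows. The kw_only part is guarded because it relies on the standard-library module from Python 3.10 or later; class names are illustrative.

```python
import sys
from dataclasses import dataclass

@dataclass
class Base:
    x: int = 0

# A required field after an inherited default is rejected, because the
# generated __init__ would place a non-default parameter after a default one.
try:
    @dataclass
    class Child(Base):
        y: int
except TypeError as exc:
    error = str(exc)

# With kw_only=True the new field becomes keyword-only, so ordering no longer matters.
if sys.version_info >= (3, 10):
    @dataclass(kw_only=True)
    class KwChild(Base):
        y: int

    kw = KwChild(y=5)
```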

    opened by guysoft 0
  • dataclasses.replace raises exception if InitVars with default argument is not provided

    This is the same issue as https://bugs.python.org/issue36470; I am opening a new issue here for greater visibility.

    Basically, if one has a dataclass that defines an InitVar with a default value, and then uses dataclasses.replace on an instance without passing the InitVar as a parameter to replace, a ValueError: InitVar '...' must be specified with replace() error is raised.

    This is bad, because the InitVar already defines its default value, there's no need to expect a value when using replace(). It makes it impossible to use replace() at the same time as default InitVars.

    The fix is trivial and there are already two PRs to upstream cpython dataclasses.

    https://github.com/python/cpython/pull/17441 https://github.com/python/cpython/pull/20867
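
Until a fixed version is available, one workaround is to pass the InitVar to replace() explicitly, which avoids the check regardless of whether the default is honoured. A minimal sketch with an illustrative Scaled class:

```python
from dataclasses import dataclass, InitVar, replace

@dataclass
class Scaled:
    value: int
    factor: InitVar[int] = 1

    def __post_init__(self, factor):
        self.value *= factor

original = Scaled(2, factor=3)   # value is stored as 2 * 3

# Passing the InitVar explicitly sidesteps the "must be specified" error.
updated = replace(original, value=10, factor=1)
```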

    opened by anthrotype 0
  • add_slots tool breaks pickling of frozen dataclasses & proposed solution

    Although not part of the official API, I find add_slots to be quite useful. However, the combination with frozen dataclasses and pickling causes problems:

    @add_slots
    @dataclass(frozen=True)
    class ExampleDataclass:
        foo: str
        bar: int
    
    
    
    assert ExampleDataclass.__slots__ == ("foo", "bar")
    
    assert pickle.loads(
        pickle.dumps(ExampleDataclass("a", 1))
    ) == ExampleDataclass("a", 1)
    

    gives the following error:

    dataclasses.FrozenInstanceError: cannot assign to field 'foo'
    

    A quick fix would be to add something like this:

    def _dataclass_getstate(self):
        return [getattr(self, f.name) for f in fields(self)]
    
    
    def _dataclass_setstate(self, state):
        for field, value in zip(fields(self), state):
            # use object.__setattr__ to bypass __setattr__, which raises on a frozen dataclass
            object.__setattr__(self, field.name, value)
    
    
    def add_slots(cls):
        ...  # existing add_slots code here...
        # optionally only do these steps if the dataclass is frozen
        cls.__getstate__ = _dataclass_getstate
        cls.__setstate__ = _dataclass_setstate
        return cls
    

    edit: typos

    opened by ariebovenberg 0
  • Problems with py37

    With the new Python version restriction, pip will now install dataclasses 0.6 if installation is attempted on Python 3.7, since this is the last available version that satisfies the constraint. This might be a problem, since it shadows the built-in version. The reason this comes up is that packaging a module that depends on dataclasses, but only requires python>=3.6, becomes tricky to do correctly. I believe the correct line is:

    dataclasses>=0.7; python_version < '3.7'
    

    Ideally, I think it would be nice if this dataclasses module simply returns the builtin python module if it's available (although I'm not sure how to implement that in practice!) If this isn't possible/desirable, perhaps you could add to the readme/docs something about how to install only on python<3.7 (e.g. using the above line)?

    opened by jph00 1
  • dataclasses.astuple breaks NamedTuple attribute of dataclass instance

    It seems the fix of https://bugs.python.org/issue34363 is missing in dataclasses version 0.7 backported to python 3.6

    # 3.6.8 (default, Feb 28 2019, 22:12:13)
    # [GCC 8.2.1 20181127](6, 0, 1)
    # dataclasses 0.7
    
    from dataclasses import InitVar, astuple, dataclass, field
    from typing import NamedTuple, Tuple
    import numpy as np
    
    class A(NamedTuple):
        x: float = np.nan
        y: float = np.nan
    
    @dataclass
    class B:
        tuple_a: InitVar[Tuple] = None
        named_tuple_a: A = field(init=False, default=A())
    
        def __post_init__(self, tuple_a):
            if tuple_a is not None:
                self.named_tuple_a = A(*tuple_a)
    
    b = B()
    astuple(b)
    
    >> (A(x=<generator object _astuple_inner.<locals>.<genexpr> at 0x7f7dec0902b0>, y=nan),)
    

    While the same code in Python 3.7.4 gives:

    # 3.7.4 (default, Oct  4 2019, 06:57:26)
    # [GCC 9.2.0] (6, 0, 1)
    
    from dataclasses import InitVar, astuple, dataclass, field
    from typing import NamedTuple, Tuple
    import numpy as np
    
    class A(NamedTuple):
        x: float = np.nan
        y: float = np.nan
    
    @dataclass
    class B:
        tuple_a: InitVar[Tuple] = None
        named_tuple_a: A = field(init=False, default=A())
    
        def __post_init__(self, tuple_a):
            if tuple_a is not None:
                self.named_tuple_a = A(*tuple_a)
    
    b = B()
    astuple(b)
    
    >> (A(x=nan, y=nan),)
    
    opened by isvoboda 1
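
A numpy-free variant of the report above can serve as a quick check of whether an installed dataclasses module includes the fix. The names mirror the report; the default values are changed to 0.0 so that the result can be compared with ==, which nan would defeat.

```python
from dataclasses import astuple, dataclass
from typing import NamedTuple

class A(NamedTuple):
    x: float = 0.0
    y: float = 0.0

@dataclass
class B:
    named_tuple_a: A = A()

result = astuple(B())

# With the fix, the NamedTuple is rebuilt field by field and stays an A;
# without it, the first element of the A holds a generator object instead.
print(result)
```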
  • asdict breaks with defaultdicts

    https://bugs.python.org/issue35540 is the official bug report on the stdlib implementation in 3.7, copied below. There's a proposed fix at https://github.com/python/cpython/pull/11361 to special-case defaultdict.

    _asdict_inner attempts to manually and recursively deep-copy dicts by calling type(obj) with a generator of transformed key-value tuples (see https://github.com/python/cpython/blob/b2f642ccd2f65d2f3bf77bbaa103dd2bc2733734/Lib/dataclasses.py#L1080). defaultdicts are dicts, so this code path runs, but unlike other dicts their first argument has to be a callable or None:

    import collections
    import dataclasses as dc
    
    @dc.dataclass()
    class C:
        d: dict
    
    c = C(collections.defaultdict(lambda: 3, {}))
    d = dc.asdict(c)
    
    assert isinstance(d['d'], collections.defaultdict)
    assert d['d']['a'] == 3
    

    =>

    Traceback (most recent call last):
      File "boom.py", line 9, in <module>
        d = dc.asdict(c)
      File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1019, in asdict
        return _asdict_inner(obj, dict_factory)
      File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1026, in _asdict_inner
        value = _asdict_inner(getattr(obj, f.name), dict_factory)
      File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1058, in _asdict_inner
        for k, v in obj.items())
    TypeError: first argument must be callable or None
    

    I understand that it isn't this bit of code's job to support every dict (and list etc.) subclass under the sun but given defaultdict is stdlib it's imo worth supporting explicitly.
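
On versions where asdict() fails this way, a shallow conversion built on fields() avoids the problem, at the cost of not recursing into nested dataclasses or copying containers. The helper name shallow_asdict is hypothetical and not part of the dataclasses API.

```python
import collections
from dataclasses import dataclass, fields

@dataclass
class C:
    d: dict

def shallow_asdict(obj):
    """Map field names to attribute values without copying or recursing."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}

c = C(collections.defaultdict(lambda: 3, {}))
d = shallow_asdict(c)
```

Because nothing is copied, d['d'] is the same defaultdict object as c.d, so mutations through one are visible through the other.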

    opened by seansfkelley 0
Owner
Eric V. Smith