jax_dataclasses
jax_dataclasses provides a wrapper around dataclasses.dataclass for use in JAX, which enables automatic support for:
- Pytree registration. This allows dataclasses to be used at API boundaries in JAX, which is necessary for function transformations, JIT, etc.
- Serialization via flax.serialization.
Notably, jax_dataclasses is designed to work seamlessly with static analysis, including tools like mypy and jedi.
Heavily influenced by some great existing work; see Alternatives for comparisons.
Installation
pip install jax_dataclasses
Core interface
jax_dataclasses is meant to provide a drop-in replacement for dataclasses.dataclass:
- jax_dataclasses.pytree_dataclass has the same interface as dataclasses.dataclass, but also registers the target class as a pytree container.
- jax_dataclasses.static_field has the same interface as dataclasses.field, but will also mark the field as static. In a pytree node, static fields will be treated as part of the treedef instead of as a child of the node; all fields that are not explicitly marked static should contain arrays or child nodes.
We also provide several aliases: jax_dataclasses.[field, asdict, astuple, is_dataclass, replace] are all identical to their counterparts in the standard dataclasses library.
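To make the distinction between children and treedef concrete, here is a minimal sketch that registers a plain frozen dataclass by hand with jax.tree_util.register_pytree_node. It is an illustration of the pytree mechanism only, not the library's implementation; the class Scaled, its fields, and the helper functions are hypothetical names chosen for this example.

```python
import dataclasses
from typing import Any, Tuple

import jax
from jax import numpy as jnp


# Hypothetical example class: one array field and one static field.
@dataclasses.dataclass(frozen=True)
class Scaled:
    values: jnp.ndarray  # array field: becomes a pytree child
    name: str  # static field: stored as auxiliary data in the treedef


def _flatten(obj: Scaled) -> Tuple[Tuple[Any, ...], str]:
    # First element: children that JAX traces. Second: static aux data.
    return (obj.values,), obj.name


def _unflatten(name: str, children: Tuple[Any, ...]) -> Scaled:
    return Scaled(values=children[0], name=name)


# Manual registration; a decorator like pytree_dataclass automates this step.
jax.tree_util.register_pytree_node(Scaled, _flatten, _unflatten)

obj = Scaled(values=jnp.arange(3.0), name="weights")

# Only the array shows up as a leaf; the string lives in the treedef.
leaves, treedef = jax.tree_util.tree_flatten(obj)

# Tree utilities rebuild the dataclass around the transformed leaves.
doubled = jax.tree_util.tree_map(lambda x: 2 * x, obj)


# Once registered, the dataclass can cross a jit boundary directly.
@jax.jit
def total(s: Scaled) -> jnp.ndarray:
    return jnp.sum(s.values)


result = total(obj)
```

Because the static value is part of the treedef, it must be hashable, and changing it triggers a retrace of any jitted function that takes the object as input.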
Mutations
All dataclasses are automatically marked as frozen and thus immutable (even when no frozen= parameter is passed in). To make changes to nested structures easier, we provide an interface that will (a) make a copy of a pytree and (b) return a context in which any of that copy's contained dataclasses are temporarily mutable:
from jax import numpy as jnp
import jax_dataclasses

@jax_dataclasses.pytree_dataclass
class Node:
    child: jnp.ndarray

obj = Node(child=jnp.zeros(3))

with jax_dataclasses.copy_and_mutate(obj) as obj_updated:
    # Make mutations to the dataclass. This is primarily useful for nested
    # dataclasses.
    #
    # Also does input validation: if the treedef, leaf shapes, or dtypes of `obj`
    # and `obj_updated` don't match, an AssertionError will be raised.
    # This can be disabled with a `validate=False` argument.
    obj_updated.child = jnp.ones(3)

print(obj)
print(obj_updated)
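For contrast, here is a sketch of a nested update using only the standard dataclasses library, which is the pattern the context manager above is meant to replace. The classes Inner, Middle, and Outer are hypothetical and exist only for this illustration.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Inner:
    value: int


@dataclasses.dataclass(frozen=True)
class Middle:
    inner: Inner


@dataclasses.dataclass(frozen=True)
class Outer:
    middle: Middle


outer = Outer(middle=Middle(inner=Inner(value=0)))

# Direct assignment is rejected at runtime on a frozen dataclass.
try:
    outer.middle.inner.value = 1
    assignment_failed = False
except dataclasses.FrozenInstanceError:
    assignment_failed = True

# Every level of nesting needs its own replace() call.
updated = dataclasses.replace(
    outer,
    middle=dataclasses.replace(
        outer.middle,
        inner=dataclasses.replace(outer.middle.inner, value=1),
    ),
)
```

The number of replace() calls grows with the nesting depth, and each one has to repeat the path to the field being changed.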
Alternatives
A few other solutions exist for automatically integrating dataclass-style objects into pytree structures. Great ones include: chex.dataclass, flax.struct, and tjax.dataclass. These all influenced this library.
The main differentiators of jax_dataclasses are:
- Static analysis support. Libraries like dataclasses and attrs rely on tooling-specific custom plugins for static analysis, which don't exist for chex or flax. tjax has a custom mypy plugin to enable type checking, but isn't supported by other tools. Because @jax_dataclasses.pytree_dataclass has the same API as @dataclasses.dataclass, it can include pytree registration behavior at runtime while being treated as the standard decorator during static analysis. This means that all static checkers, language servers, and autocomplete engines that support the standard dataclasses library should work out of the box with jax_dataclasses.
- Nested dataclasses. Making replacements/modifications in deeply nested dataclasses is generally very frustrating. The three alternatives all introduce a .replace(self, ...) method to dataclasses that's a bit more convenient than the traditional dataclasses.replace(obj, ...) API for shallow changes, but still becomes really cumbersome to use when dataclasses are nested. jax_dataclasses.copy_and_mutate() is introduced to address this.
- Static field support. Parameters that should not be traced in JAX should be marked as static. This is supported in flax, tjax, and jax_dataclasses, but not chex.
- Serialization. When working with flax, being able to serialize dataclasses is really handy. This is supported in flax.struct (naturally) and jax_dataclasses, but not chex or tjax.
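The static analysis point relies on a decorator that looks like the standard one to tooling but does extra work at runtime. One common way to arrange this is a typing.TYPE_CHECKING branch, sketched below. This is an assumption about the general technique, not a copy of the library's code; registering_dataclass and REGISTRY are hypothetical names, and the registry list stands in for real pytree registration.

```python
import dataclasses
from typing import TYPE_CHECKING, List

# Hypothetical stand-in for pytree registration: just records the classes.
REGISTRY: List[type] = []

if TYPE_CHECKING:
    # Static tools only ever see the standard decorator.
    from dataclasses import dataclass as registering_dataclass
else:

    def registering_dataclass(cls=None, **kwargs):
        """Runtime version: build a frozen dataclass, then record it."""

        def wrap(target):
            # Force frozen=True regardless of what the caller passed.
            options = {**kwargs, "frozen": True}
            new_cls = dataclasses.dataclass(**options)(target)
            REGISTRY.append(new_cls)
            return new_cls

        # Support both @registering_dataclass and @registering_dataclass(...).
        return wrap if cls is None else wrap(cls)


@registering_dataclass
class Point:
    x: float
    y: float


p = Point(x=1.0, y=2.0)
```

Since type checkers never evaluate the else branch, they infer Point's constructor and fields exactly as they would for @dataclasses.dataclass, with no plugin required.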
Misc
This code was originally written for and factored out of jaxfg, where Nick Heppert provided valuable feedback!