A lightweight library for converting complex objects to and from simple Python datatypes.

Overview

marshmallow: simplified object serialization


marshmallow is an ORM/ODM/framework-agnostic library for converting complex datatypes, such as objects, to and from native Python datatypes.

from datetime import date
from pprint import pprint

from marshmallow import Schema, fields


class ArtistSchema(Schema):
    name = fields.Str()


class AlbumSchema(Schema):
    title = fields.Str()
    release_date = fields.Date()
    artist = fields.Nested(ArtistSchema())


bowie = dict(name="David Bowie")
album = dict(artist=bowie, title="Hunky Dory", release_date=date(1971, 12, 17))

schema = AlbumSchema()
result = schema.dump(album)
pprint(result, indent=2)
# { 'artist': {'name': 'David Bowie'},
#   'release_date': '1971-12-17',
#   'title': 'Hunky Dory'}

In short, marshmallow schemas can be used to:

  • Validate input data.
  • Deserialize input data to app-level objects.
  • Serialize app-level objects to primitive Python types. The serialized objects can then be rendered to standard formats such as JSON for use in an HTTP API.
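
The reverse direction goes through load(), which validates input and reports errors. A minimal sketch building on the schemas above (the exact error text may vary between releases):

from marshmallow import ValidationError

try:
    AlbumSchema().load({"title": "Hunky Dory", "release_date": "not-a-date"})
except ValidationError as err:
    print(err.messages)  # e.g. {'release_date': ['Not a valid date.']}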

Get It Now

$ pip install -U marshmallow

Documentation

Full documentation is available at https://marshmallow.readthedocs.io/.

Requirements

  • Python >= 3.5

Ecosystem

A list of marshmallow-related libraries is maintained on the GitHub wiki:

https://github.com/marshmallow-code/marshmallow/wiki/Ecosystem

Credits

Contributors

This project exists thanks to all the people who contribute.

You're highly encouraged to participate in marshmallow's development. Check out the Contributing Guidelines to see how you can help.

Thank you to all who have already contributed to marshmallow!


Backers

If you find marshmallow useful, please consider supporting the team with a donation. Your donation helps move marshmallow forward.

Thank you to all our backers! [Become a backer]


Sponsors

Support this project by becoming a sponsor (or ask your company to support this project by becoming a sponsor). Your logo will show up here with a link to your website. [Become a sponsor]


Professional Support

Professionally-supported marshmallow is now available through the Tidelift Subscription.

Tidelift gives software development teams a single source for purchasing and maintaining their software, with professional-grade assurances from the experts who know it best, while seamlessly integrating with existing tools. [Get professional support]


Security Contact Information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.


License

MIT licensed. See the bundled LICENSE file for more details.

Issues
  • Add option to throw validation error on extra keys


    I'm not sure if I'm missing something, but I would like to throw an error if extra keys are provided, as in:

    >>> class AlbumSchema(Schema):
    ...     title = fields.Str()
    
    >>> AlbumSchema(strict=True).load({'extra': 2}, allow_extra=False)
    Traceback (most recent call last):
    ...
    marshmallow.exceptions.ValidationError: {'_schema': ["Extra arguments passed: ['extra']"]}
    

    I've been using the following implementation (taken from https://github.com/sloria/webargs/issues/87#issuecomment-183949558):

    from marshmallow import Schema, ValidationError, pre_load

    class BaseSchema(Schema):
    
        @pre_load
        def validate_extra(self, in_data):
            if not isinstance(in_data, dict):
                return
    
            extra_args = [key for key in in_data.keys() if key not in self.fields]
            if extra_args:
                raise ValidationError('Extra arguments passed: {}'.format(extra_args))
    

    I would expect this to be a common need, however, so could it be supported by the library out-of-the-box?

    enhancement feedback welcome 
    opened by tuukkamustonen 47
  • Field validators as schema methods?


    Often, validators and preprocessors only apply to a single Schema, so it may make sense to define them within the Schema--rather than as separate functions--for better cohesion.

    Method names could be passed to __validators__ et al., like so:

    class UserSchema(Schema):
        __validators__ = ['validate_schema']
    
        def validate_schema(self, data):
            # ...
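
    Current marshmallow releases cover this use case with the @validates and @validates_schema decorators, which register schema methods as validators. A minimal sketch of that approach (not the __validators__ syntax proposed above):

    from marshmallow import Schema, fields, validates, validates_schema, ValidationError

    class UserSchema(Schema):
        name = fields.Str()
        age = fields.Int()

        @validates("age")
        def validate_age(self, value, **kwargs):
            # Field-level validator defined as a schema method.
            if value < 0:
                raise ValidationError("Age must be non-negative.")

        @validates_schema
        def validate_whole(self, data, **kwargs):
            # Schema-level validator with access to all deserialized data.
            if data.get("age", 0) > 0 and not data.get("name"):
                raise ValidationError("name is required when age is given.", "name")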
    
    enhancement feedback welcome 
    opened by sloria 37
  • Handle unknown fields with EXCLUDE, INCLUDE or RAISE


    This is a rework of #595 so that it supports the API discussed in #524.

    By default, only the fields described in the schema are returned when data is deserialized.

    This commit implements an unknown option, so that Schema(unknown=ALLOW) or Schema().load(data, unknown=ALLOW) permit users to receive the unknown fields from the data.

    Schema(unknown=RAISE) or Schema().load(data, unknown=RAISE) will raise a ValidationError for each unknown field.

    Schema(unknown=IGNORE) or Schema().load(data, unknown=IGNORE) keep the original behavior.

    Edit: ALLOW will finally be INCLUDE, and IGNORE will be EXCLUDE.
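
    This is the behaviour that shipped in marshmallow 3, where RAISE is the library default. A minimal sketch using the final names:

    from marshmallow import Schema, fields, EXCLUDE, INCLUDE, RAISE, ValidationError

    class AlbumSchema(Schema):
        title = fields.Str()

        class Meta:
            unknown = EXCLUDE  # per-schema default; can be overridden per load() call

    AlbumSchema().load({"title": "Hunky Dory", "extra": 2})                   # extra key silently dropped
    AlbumSchema().load({"title": "Hunky Dory", "extra": 2}, unknown=INCLUDE)  # extra key kept in the result
    try:
        AlbumSchema().load({"title": "Hunky Dory", "extra": 2}, unknown=RAISE)
    except ValidationError as err:
        print(err.messages)  # e.g. {'extra': ['Unknown field.']}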

    opened by ramnes 36
  • Allow callables for fields.Nested


    Suggested earlier here: https://github.com/marshmallow-code/marshmallow/issues/19#issuecomment-43685498.

    The suggestion would be to allow:

    foo = fields.Nested(lambda: Foo)
    

    in addition to

    foo = fields.Nested('Foo')
    

    Seems a bit cleaner in general.
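
    For the record, marshmallow 3 ended up accepting a callable returning a schema instance (in addition to a schema class or a registered string name), which also covers self-referential schemas. A minimal sketch:

    from marshmallow import Schema, fields

    class FooSchema(Schema):
        name = fields.Str()
        # The lambda defers evaluation, so the schema can reference itself.
        children = fields.Nested(lambda: FooSchema(), many=True)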

    opened by taion 36
  • Processing API for dumping does not match API for loading


    When dumping, I want to manipulate some of the attributes on the object being dumped before actually dumping. So I decorate a preprocessor function. But that only gets used when loading. I could use a Method field, but the data returned from that must be the final serialized form, so there's no way to specify that it's actually a Nested(OtherSchema, many=True) unless I do that serialization manually at the end of the method.

    When loading, I want to manipulate the loaded data in the exact opposite direction as the situation above. So I decorate a data_handler function. But that only gets used when dumping. The solution is more straightforward here: I override the make_object method to manipulate the final output.

    My point is that the names are different everywhere, some of them are decorators while others are methods, and they don't apply in both directions, when it would be very convenient to be able to do so. The loading situation is in better shape than the dumping one, since loading has preprocessor and make_object, except there's still a weird decorator vs. method difference.

    There should be a standard way to preprocess and postprocess the entire data during both loading and dumping. One solution is adding pre_dump, post_dump, pre_load, and post_load hooks, either as methods on the schema or as decorators.
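
    These hooks did land as decorators (pre_load, post_load, pre_dump, post_dump), giving both directions the same shape. A minimal sketch of the symmetric API:

    from marshmallow import Schema, fields, pre_load, post_load, pre_dump, post_dump

    class UserSchema(Schema):
        name = fields.Str()

        @pre_load
        def strip_input(self, data, **kwargs):
            # Clean raw input before field deserialization.
            data["name"] = data["name"].strip()
            return data

        @post_load
        def make_user(self, data, **kwargs):
            # Turn the loaded dict into an app-level object (a plain dict here).
            return {"user": data}

        @pre_dump
        def unwrap(self, obj, **kwargs):
            # Normalize the object before serialization.
            return obj.get("user", obj)

        @post_dump
        def add_envelope(self, data, **kwargs):
            # Post-process the serialized output.
            return {"data": data}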

    feedback welcome 
    opened by davidism 33
  • Nested 'only' parameter


    This allows specifying a nested only parameter both at schema instantiation time and during dump(), like so:

    class ChildSchema(Schema):
        foo = fields.Field()
        bar = fields.Field()
        baz = fields.Field()
    
    class ParentSchema(Schema):
        bla = fields.Field()
        bli = fields.Field()
        blubb = fields.Nested(ChildSchema)
    
    data = dict(bla=1, bli=2, blubb=dict(foo=42, bar=24, baz=242))
    
    # either when instantiating
    sch = ParentSchema(only=('bla', ('blubb', ('foo', 'bar'))))
    result = sch.dump(data)
    
    #or when dumping
    sch = ParentSchema()
    result = sch.dump(data, only=('bla', ('blubb', ('foo', 'bar'))))
    

    It is fully backwards compatible. Fixes #402
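
    For comparison, the syntax that eventually landed in marshmallow uses dot-delimited names instead of nested tuples; assuming the same schemas, the equivalent selection would look like:

    # Dot-delimited paths propagate 'only' into the nested schema.
    sch = ParentSchema(only=("bla", "blubb.foo", "blubb.bar"))
    result = sch.dump(data)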

    feedback welcome 
    opened by Tim-Erwin 33
  • Propose add required to schema constructor


    For a PUT request the model requires the id field, but for a POST request id is not needed. To disable it for POST I use dump_only=["id"], but I have not found a similar way to make id required. For this reason, I suggest adding required to the schema constructor.
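
    A common workaround today is to declare the field required and relax it per request with the existing partial option; a sketch (ItemSchema is illustrative, not from this issue):

    from marshmallow import Schema, fields

    class ItemSchema(Schema):
        id = fields.Int(required=True)
        name = fields.Str()

    ItemSchema().load({"id": 1, "name": "foo"})          # PUT: id must be present
    ItemSchema().load({"name": "foo"}, partial=("id",))  # POST: skip the required check for id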

    question 
    opened by heckad 30
  • Drop Python 2 support


    ~~marshmallow 3 will be the last major version to support Python 2.~~ Per discussion below, Python 2 support will be dropped in marshmallow 3.

    backwards incompat 
    opened by sloria 30
  • How to create a Schema containing a dict of nested Schema?


    Hi. I've been digging around and couldn't find the answer to this.

    Say I've got a model like this:

    class AlbumSchema(Schema):
        year = fields.Int()
    
    class ArtistSchema(Schema):
        name = fields.Str()
        albums = ...
    

    I want albums to be a dict of AlbumSchema, so that ArtistSchema serializes as

    { 'albums': { 'Hunky Dory': {'year': 1971},
                  'The Man Who Sold the World': {'year': 1970}},
      'name': 'David Bowie'}
    

    Naively, I would expect syntaxes like this to work:

    fields.List(Schema)
    fields.Dict(Schema)
    

    or maybe

    fields.List(fields.Nested(Schema))
    fields.Dict(fields.Nested(Schema))
    

    Serializing a list of Schema can be achieved through Nested(Schema, many=True), which I find less intuitive, and I don't know about a dict of Schema.

    Is there any way to do it? Or a good reason not to do it?

    (Question also asked on SO.)
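
    marshmallow 3 supports this directly: fields.Dict takes keys and values field instances, and fields.List takes a field instance. A minimal sketch:

    from marshmallow import Schema, fields

    class AlbumSchema(Schema):
        year = fields.Int()

    class ArtistSchema(Schema):
        name = fields.Str()
        # Mapping of album title -> nested AlbumSchema data.
        albums = fields.Dict(keys=fields.Str(), values=fields.Nested(AlbumSchema))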

    help wanted needs review 
    opened by lafrech 30
  • Support Enum for Select-field


    I have got some use cases where I use an Enum in my models.

    Example:

    # definitions.py
    from enum import Enum
    
    class Gender(Enum):
        male = 'm'
        female = 'f'
    
    # models.py
    from definitions import Gender
    
    class Person:
        def __init__(self, gender: Gender):
            self.gender = gender
    

    Now it would be great if I could use the Enum in marshmallow.fields.Select:

    # schemas.py
    from marshmallow import fields, Schema
    from definitions import Gender
    from models import Person
    
    class PersonSchema(Schema):
        gender = fields.Select(Gender)
    
        @staticmethod
        def make_object(data) -> Person:
            return Person(**data)
    

    For backwards-compatibility enum34 could be used.
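
    For what it's worth, newer marshmallow releases (3.18+, if I recall correctly) ship a built-in fields.Enum; older versions typically used the third-party marshmallow-enum package. A minimal sketch with the built-in field:

    from marshmallow import Schema, fields
    from definitions import Gender

    class PersonSchema(Schema):
        # By default the member name is used; by_value=True (de)serializes 'm'/'f' instead.
        gender = fields.Enum(Gender, by_value=True)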

    opened by floqqi 27
  • Question: Can I generate a dictionary of expected fields for documentation?


    I'm using flask-restx for my rest-api creation, and in there, you can specify a dictionary of expected parameters and their specifications (type, required, etc.). I was wondering if there's either a function that can do exactly that, or if there's something similar that I can hack my way into making it possible.
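
    One hedged approach: a schema instance exposes its bound fields as a dict, so a simple parameter spec can be derived by introspection (the apispec project automates this for OpenAPI, if that is an option). A sketch, with PagingSchema illustrative and the flask-restx mapping left out:

    from marshmallow import Schema, fields

    class PagingSchema(Schema):
        page = fields.Int(required=True)
        per_page = fields.Int()

    def describe(schema):
        # Build a simple {name: {"type": ..., "required": ...}} spec from the bound fields.
        return {
            name: {"type": type(field).__name__, "required": field.required}
            for name, field in schema.fields.items()
        }

    print(describe(PagingSchema()))
    # e.g. {'page': {'type': 'Integer', 'required': True}, 'per_page': {'type': 'Integer', 'required': False}}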

    question 
    opened by Elysium1436 1
  • Meta Decorator


    Motivation

    I often have lots of small schemas containing unique Meta classes with only a few attributes. The Meta class sometimes requires more lines than the schema definition and makes a file containing multiple schemas harder to read. This is especially true when using marshmallow-jsonapi due to type_.

    Proposal

    Provide a meta decorator that will inject its kwargs into a Meta class for the class it is wrapping.

    Example

    Before:

    class TestSchema(Schema):
        foo = fields.String()
        bar = fields.String()
    
        class Meta:
            type_ = 'tests'
            ordered = True
    

    After:

    @meta(type_='tests', ordered=True)
    class TestSchema(Schema):
        foo = fields.String()
        bar = fields.String()
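
    A rough sketch of how such a decorator could be implemented today, outside the library (it re-creates the class rather than just attaching Meta, because marshmallow's metaclass reads Meta at class-creation time):

    def meta(**attrs):
        def wrapper(cls):
            # Build the Meta class from the keyword arguments...
            meta_cls = type("Meta", (), attrs)
            # ...and re-create the schema class via its metaclass so the
            # options are resolved with the injected Meta.
            return type(cls)(cls.__name__, (cls,), {"Meta": meta_cls})
        return wrapper

    Usage would then match the "After" example above.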
    
    enhancement 
    opened by deckar01 3
  • how to handle incoming json object if one of the fields can't be a valid attribute


    In the JSON I receive there is an "@timestamp" key.

    "actions": [
        {
          "@timestamp": "2021-09-10T13:59:01.500",
          "timestamp": "2021-09-10T13:59:01.500",
    

    How do I process that properly? I think there must be some way to map it to a different attribute, say "_timestamp", but data_key, load_from, and dump_to do not seem to help.
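
    For reference, in marshmallow 3 data_key is the intended tool for exactly this (load_from/dump_to are its marshmallow 2 predecessors); if it really does not work, the installed version may predate it. A minimal sketch with an illustrative ActionSchema:

    from marshmallow import Schema, fields

    class ActionSchema(Schema):
        # Map the awkward JSON key onto a normal attribute name.
        at_timestamp = fields.DateTime(data_key="@timestamp")
        timestamp = fields.DateTime()

    ActionSchema().load({"@timestamp": "2021-09-10T13:59:01.500",
                         "timestamp": "2021-09-10T13:59:01.500"})
    # -> roughly {'at_timestamp': datetime(...), 'timestamp': datetime(...)}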

    question 
    opened by shalakhin 2
  • Issues deserializing to objects


    I'm having a couple of issues with what feels like a fairly vanilla use case: deserializing from a JSON string to one or more Python objects, possibly with custom fields. The version I'm using is 3.13.0. Not sure if I'm doing something wrong or if there are any bugs present.

    1. When using @post_load, the data argument contains strings that have not been processed by their respective fields' _deserialize methods. If I remove the @post_load-decorated method, the strings in the resulting dictionary are correctly processed.

    Example (stripped of everything unnecessary):

    class Dollar_string(fields.Field):
        def _serialize(self, value, attr, obj, **kwargs):
            return f"${value}"
    
        def _deserialize(self, value, attr, obj, **kwargs):
                # Putting a print statement here reveals that this method does not get called. In the absence
                # of My_schema.load, that is no longer true.
                return Decimal(value[1:])
    
    
    class My_schema(Schema):
        number_of_dollars = Dollar_string(required=True)
    
        @post_load
        def load(self, data, **kwargs):
            return My_object.init(**data)
    
    
    @attr.s
    class My_object:
        number_of_dollars = attr.ib()
    
        @staticmethod
        def init(*, number_of_dollars):
            # Here number_of_dollars is the raw string given, ie "$10" including the dollar sign.
            return My_object(number_of_dollars)
    
    
        @staticmethod
        def load_file(file):
            with open(file, 'r') as f:
                schema = My_schema()
                return schema.loads(f.read())
    
    2. Passing many=True to a schema constructor appears to have no effect. In the following example I have a file with a JSON list of objects corresponding to the provided schema. In the presence of many=True I would have expected schema.loads() to apply the schema to each element of the list and give me back a list of objects. (I've also tried passing many=True to schema.loads and that doesn't change anything.)

    Example:

    @attr.s
    class My_object:
        my_field = attr.ib()
    
        @staticmethod
        def init(*, my_field):
            return My_object(my_field)
    
    
    class My_schema(Schema):
        my_field = fields.String()
    
        @post_load
        def load(self, data, **kwargs):
            # Here I get data as a list, resulting in the comprehension below. I would have expected that
            # this method would instead get a dict representing a single object. As a result I'm not able
            # to reuse this schema if I have another use case where I want to deserialize a single object.
            return [My_object.init(**inner) for inner in data]
    
    
    def load_file(file):
        with open(file, 'r') as fd:
            schema = My_schema(many=True)
            return schema.loads(fd.read())
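
    One likely culprit (my reading, not a confirmed answer): naming the @post_load hook load overrides Schema.load itself, so field deserialization, validation, and many=True handling are all bypassed. Renaming the hook restores the normal pipeline. A self-contained variant of the snippet above:

    from dataclasses import dataclass
    from marshmallow import Schema, fields, post_load

    @dataclass
    class MyObject:
        my_field: str

    class MySchema(Schema):
        my_field = fields.String()

        @post_load  # any name other than "load"/"loads" avoids shadowing Schema.load
        def make_object(self, data, **kwargs):
            # With many=True this hook runs once per item, so data is always a single dict.
            return MyObject(**data)

    print(MySchema(many=True).loads('[{"my_field": "a"}, {"my_field": "b"}]'))
    # [MyObject(my_field='a'), MyObject(my_field='b')]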
    
    question 
    opened by danben 1
  • When using @validates_schema and raising marshmallow.ValidationError, only load() works, not validate()


    This is my validator

    class NewApplicationValidator(Schema):
        posting_id = fields.String()
        email = fields.Email()
        cover_letter = fields.String()
        cv_link = fields.Url()
    
        # Check that the current user hasn't applied to this posting already
        @validates_schema
        def validate_unique_application(self, data, **kwargs):
            # Get the current user
            applicant = User.objects.filter(nfkc_email=data["email"])

            # Get the posting
            posting = Posting.objects.filter(pk=data["posting_id"])

            # Get any applications for this user and this posting
            application = Application.objects.filter(
                applicant=applicant
            ).filter(
                posting=posting
            )

            # If any application is found for this posting and user, raise ValidationError
            if application is not None:
                raise ValidationError("An application for this posting has already been made by this user!")
            
            # If no application was found just log succeeded validation
            logging.info("From validators.py: Ok to create new application, no previous application found")
    

    If I then run:

    try:
        NewApplicationValidator().load(context)
    except marshmallow.exceptions.ValidationError as e:
        return error_utils.get_validation_error_response(validation_error=e, http_status_code=422)
    

    The exception will be picked up.

    However, if I replace "load" with "validate", the error is not picked up.
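
    As far as I can tell that is by design: Schema.validate() collects errors and returns them as a dict instead of raising, so the try/except never fires. A sketch of checking its return value instead (reusing context from the snippet above; handle_errors is a hypothetical stand-in for the response helper):

    errors = NewApplicationValidator().validate(context)
    if errors:
        # e.g. {'_schema': ['An application for this posting has already been made by this user!']}
        handle_errors(errors)  # hypothetical handler; adapt error_utils accordingly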

    question 
    opened by minifisk 1
  • Fix type-hints for `data` arg in `Schema.validate`


    fixes #1790

    Checklist:

    • [x] pre-commit tests passed
    • [x] CI passed
    • [x] Update CHANGELOG.rst
    • [x] Add myself to AUTHORS.rst
    opened by Yourun-proger 0
  • Fix: TimeDelta Precision Errors


    Use microsecond integer arithmetic to fix high precision timedelta errors.

    Fixes #1865

    opened by deckar01 0
  • TimeDelta serialization precision


    Hi there!

    I just found quite strange behaviour of TimeDelta field serialization

    from marshmallow.fields import TimeDelta
    from datetime import timedelta
    
    td_field = TimeDelta(precision="milliseconds")
    
    obj = dict()
    obj["td_field"] = timedelta(milliseconds=345)
    
    print(td_field.serialize("td_field", obj))
    

    Output of this snippet is 344, but it seems that 345 is correct.

    Looks like a rounding issue here: https://github.com/marshmallow-code/marshmallow/blob/dev/src/marshmallow/fields.py#L1474
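
    The arithmetic behind it, for what it's worth: the linked line divides two float total_seconds() values and truncates, and 0.345 / 0.001 comes out just under 345 in binary floating point; exact integer arithmetic on timedeltas avoids the truncation, which is the direction the fix above takes. A small illustration in plain Python, not marshmallow internals:

    from datetime import timedelta

    value = timedelta(milliseconds=345)
    base = timedelta(milliseconds=1)

    print(value.total_seconds() / base.total_seconds())       # just under 345.0
    print(int(value.total_seconds() / base.total_seconds()))  # 344 <- the reported result
    print(value // base)                                      # 345, exact integer division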

    opened by yarsanich 4
  • Always set data_key


    Hello all,

    In my application we use and love marshmallow. We have a lot of dynamic code, so we access fields e.g. via the declared_fields attribute. We also frequently need the Field.data_key attribute, which is typed as str | None.

    I would like to suggest an option which always sets data_key so that it is never None. If the user provides it, fine; otherwise, just set it to the field name. I guess this could also simplify some code, since you could rely on the attribute always being set.

    What do you think?
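
    For context, the usual workaround is a fallback at the call site; a sketch assuming a schema instance named schema:

    for name, field in schema.fields.items():
        # Fall back to the field name whenever data_key was not set explicitly.
        effective_key = field.data_key if field.data_key is not None else name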

    opened by kasium 2
  • Schema.dump_only returns empty set if fields explicitly declared dump_only


    Ran across this behavior when trying to determine the dump_only fields in a given Schema. If the Schema defines dump_only fields via the Meta class approach, it works as expected, but if I specify dump_only when adding the fields, Schema.dump_only returns an empty set.

    An easy workaround is to use Schema.dump_fields.items() - Schema.load_fields.items() so this isn't a show-stopper for me, but it was counter-intuitive when I saw that Schema had a dump_only attribute and expected it to include those.

    class MySchema(m.Schema):
        id = m.fields.Integer(dump_only=True)
        label = m.fields.String(missing="(none)")
        
    instance = MySchema()
    assert(instance.dump_only == set()) # this should be {"id"}, no???
    
    # declaring dump_only explicitly as Meta:
    class MyOtherSchema(m.Schema):
        id = m.fields.Integer()
        label = m.fields.String(missing="(none)")
        class Meta:
            dump_only=("id", )
    
    other_instance = MyOtherSchema()
    assert(other_instance.dump_only == {"id"})
    opened by RookieRick 0