sync/async MongoDB ODM, yes.

Overview

μMongo: sync/async ODM

μMongo is a Python MongoDB ODM. Its inception comes from two needs: the lack of an async ODM and the difficulty of doing document (un)serialization with existing ODMs.

From this point, μMongo made a few design choices:

  • Stay close to the standard MongoDB driver to keep the same API when possible: use find({"field": "value"}) as usual, but retrieve your data nicely OO-wrapped!
  • Work with multiple drivers (PyMongo, TxMongo, motor_asyncio and mongomock for the moment)
  • Tight integration with Marshmallow serialization library to easily dump and load your data with the outside world
  • i18n integration to localize validation error messages
  • Free software: MIT license
  • Tested with 90%+ coverage ;-)

µMongo requires MongoDB 4.2+ and Python 3.7+.

Quick example

import datetime as dt
from pymongo import MongoClient
from umongo import Document, fields, validate
from umongo.frameworks import PyMongoInstance

db = MongoClient().test
instance = PyMongoInstance(db)

@instance.register
class User(Document):
    email = fields.EmailField(required=True, unique=True)
    birthday = fields.DateTimeField(validate=validate.Range(min=dt.datetime(1900, 1, 1)))
    friends = fields.ListField(fields.ReferenceField("User"))

    class Meta:
        collection_name = "user"

# Make sure that unique indexes are created
User.ensure_indexes()

goku = User(email='[email protected]', birthday=dt.datetime(1984, 11, 20))
goku.commit()
vegeta = User(email='[email protected]', friends=[goku])
vegeta.commit()

vegeta.friends
# <object umongo.data_objects.List([<object umongo.dal.pymongo.PyMongoReference(document=User, pk=ObjectId('5717568613adf27be6363f78'))>])>
vegeta.dump()
# {'id': '570ddb311d41c89cabceeddc', 'email': '[email protected]', 'friends': ['570ddb2a1d41c89cabceeddb']}
User.find_one({"email": '[email protected]'})
# <object Document __main__.User({'id': ObjectId('570ddb2a1d41c89cabceeddb'), 'friends': <object umongo.data_objects.List([])>,
#                                 'email': '[email protected]', 'birthday': datetime.datetime(1984, 11, 20, 0, 0)})>

Get it now:

$ pip install umongo           # This installs umongo with pymongo
$ pip install my-mongo-driver  # Other MongoDB drivers must be installed manually

Or to get it along with the MongoDB driver you're planning to use:

$ pip install umongo[motor]
$ pip install umongo[txmongo]
$ pip install umongo[mongomock]
Comments
  • Marshmallow 3 compatibility

    Closes https://github.com/Scille/umongo/issues/129.

    The failing test is due to an issue in marshmallow that was fixed since 3.0.0rc1 was released. It'll be fixed with next marshmallow release.

    opened by lafrech 20
  • Add count_documents support to motor asyncio Documents

    Update API to support v2.0.0 of motor. This involves changing tests that use the deprecated count() cursor method to use either count_documents() or to_list() instead, and removing the callback keyword argument from the to_list wrapper.

    opened by wfscheper 17
  • Fix allow_none fields [de|]serialization

    I'm afraid there's an issue with allow_none fields.

    For both serialization and deserialization to/from Mongo, if the value is None, None should be returned and all _(de)serialize_from_mongo overridden methods should be skipped, as they don't deal with the allow_none case and either break (e.g. StrictDateTime) or return a nullish object (empty list, empty dict).

    See attached fix proposal.

    I didn't code any test as I'm in a bit of a rush and I'd rather have your feedback first.
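
    The guard described above can be sketched in plain Python, independent of umongo's actual code. Both `deserialize_from_mongo` and `strict_datetime` are hypothetical names used only for illustration; the point is that None short-circuits before any field-specific conversion runs.

```python
import datetime as dt


def deserialize_from_mongo(value, convert):
    """Return None untouched; otherwise delegate to the field-specific converter."""
    if value is None:
        return None
    return convert(value)


def strict_datetime(value):
    # Hypothetical converter that, like a strict field, rejects anything
    # that is not already a datetime (so it would break on None).
    if not isinstance(value, dt.datetime):
        raise TypeError("expected a datetime")
    return value
```

    With this shape, a converter such as `list` is never called on None, so a nullable list field yields None rather than an empty list.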

    bug 
    opened by lafrech 17
  • Pass as marshmallow schema params to nested schemas

    Parameters of as_marshmallow_schema should be passed to nested Schemas.

    In this implementation, I pass them to all fields, so that EmbeddedField gets them and passes them along when calling as_marshmallow_schema.

    Nothing really complicated here. I had to modify as_marshmallow_field's signature, hence the modifications in unrelated fields.

    Note that I changed kwargs into field_kwargs in those fields to make it obvious to the reader it is not the same kwargs we're talking about. It would work without the rename, but it would look like a mistake at first sight:

        def as_marshmallow_field(self, params=None, mongo_world=False, **kwargs):
            # Oh, we're overwriting kwargs! Let's file a bug!
            kwargs = self._extract_marshmallow_field_params(mongo_world)
    

    Another option was to use _ but I'm not sure it is so common for kwargs:

        def as_marshmallow_field(self, params=None, mongo_world=False, **_):
            kwargs = self._extract_marshmallow_field_params(mongo_world)
    

    I preferred to rename the field kwargs inside the method:

        def as_marshmallow_field(self, params=None, mongo_world=False, **kwargs):
            field_kwargs = self._extract_marshmallow_field_params(mongo_world)
    

    I could rework this if you have a better idea.

    This PR does not address those two related issues I assigned myself a while ago: https://github.com/Scille/umongo/issues/64, https://github.com/Scille/umongo/issues/65...

    opened by lafrech 16
  • Raise exception with Document name when unknown field found in DB

    Currently, when an unknown field is found in database using a strict DataProxy, the KeyError is not caught. When dealing with deeply nested documents, it can be a bit cumbersome to track the faulty EmbeddedDocument.

    This PR creates a dedicated exception that prints the name of the data proxy to ease the debug procedure.
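
    As a rough illustration of the idea (not the PR's actual code), a strict loader can raise an error that carries the name of the data proxy. The function `load_strict` and its parameters are assumptions made up for this sketch.

```python
class UnknownFieldInDBError(Exception):
    """Illustrative error: the database holds a field the schema does not know."""


def load_strict(proxy_name, known_fields, raw):
    """Copy raw DB data, failing loudly (with the proxy name) on unknown keys."""
    data = {}
    for key, value in raw.items():
        if key not in known_fields:
            raise UnknownFieldInDBError(
                f"{proxy_name}: unknown field {key!r} found in DB"
            )
        data[key] = value
    return data
```

    The message then points directly at the faulty embedded document instead of surfacing a bare KeyError.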

    enhancement 
    opened by lafrech 14
  • Add reset_missings parameter to DocumentImplementation's update method

    Implementation proposal for the reset_missings feature.

    Removes value from document if loadable in the Schema but missing in input data.

    user = User(name='Hannibal', team="A-Team")
    user.update({'name': 'Looping'})
    assert user.team == "A-Team"
    user.update({'name': 'Chuck Norris'}, reset_missings=True)
    assert user.team is None
    
    opened by lafrech 14
  • Fix embedded document inheritance

    I'm afraid there's an issue with embedded document inheritance when deserializing from DB: the _cls attribute is ignored. Looks like it only works when loading from outside world, not from DB.

    This dirty fix seems to do the trick in my use case, but it is certainly not the right implementation. It is mostly a copy-paste from _deserialize to _deserialize_from_mongo. It should at least be factorized.

    I'm not sure the instance should be created using to_use_cls(**value). It didn't work for me (complained about an unknown field name due to a dump_only field), so I randomly tried with build_from_mongo and it happened to work...

    I think I had this case covered in my PR since I overrode build_from_mongo in EmbeddedDocumentImplementation to pick the right class. But I'm not sure this is relevant to your implementation. I basically copied inheritance related stuff from Document to EmbeddedDocument, while you addressed inheritance in EmbeddedField, AFAIU.

    BTW, isn't there a way in EmbeddedField to get embedded_document_cls at init time (when declaring the field in the schema)?
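
    For readers unfamiliar with the `_cls` mechanism, here is a minimal plain-Python sketch of class dispatch when building from stored data. The registry, the `register` decorator and the `Vehicle`/`Car` classes are all hypothetical; only the `_cls` key and the `build_from_mongo` name come from the discussion above, and this is not how umongo implements it.

```python
REGISTRY = {}


def register(cls):
    """Record the class under its name so stored `_cls` values can be resolved."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class Vehicle:
    def __init__(self, **data):
        self.data = data

    @classmethod
    def build_from_mongo(cls, data):
        # Pick the concrete class named by `_cls`, falling back to `cls`.
        target = REGISTRY.get(data.get("_cls"), cls)
        fields = {k: v for k, v in data.items() if k != "_cls"}
        return target(**fields)


@register
class Car(Vehicle):
    pass
```

    Ignoring `_cls` at this step is what makes a stored child document come back as its parent type.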

    bug 
    opened by lafrech 13
  • Rethink the missing/default field attributes

    Those attributes come from marshmallow, which is all about serialize/deserialize a document:

    • missing is used when the field is not present in the dict to deserialize
    • default is used when the field is not present in the document to serialize

    However, in umongo the focus shifts toward the document itself, so those terms seem a bit clunky:

    • the missing object is already used inside umongo's DataProxy to represent a field not present in MongoDB. Hence the missing attribute should mean what value to return if the value is missing
    • the default attribute sounds like the value to set by default to the field when creating a new document

    In a nutshell, the meanings of missing and default are reversed between marshmallow and umongo...

    What I'm thinking to do:

    • Remove the missing attribute (in fact just hide it from abstract.BaseField constructor and documentation)
    • Only use the default attribute for both missing and default (in marshmallow's logic)
    • Add methods to get from the umongo document an equivalent pure marshmallow Schema (see #34)

    The idea is to hide marshmallow logic to expose a more consistent API from the umongo user's point of view. Then provide a way to get back a pure Marshmallow Schema when needed to do all the customization.

    example:

    @instance.register
    class Person(Document):
        name = fields.StrField(default='John Doe')
    
    p = Person()
    # Default is set
    assert p.name == 'John Doe'
    # Default value will be written in database
    assert p._data.get('name') == 'John Doe'
    # If we want more cunning behavior (e.g. only save in database non-default value)
    # we should use a `MethodField` to define custom method for serialization/deserialization
    
    del p.name
    assert p._data.get('name') == missing
    # If not present on database, we switch back to default value as well
    assert p.name == 'John Doe'
    
    # Now it's customization time !
    PersonSchema = p.get_marshmallow_schema()
    class MyCustomSchema(PersonSchema):
        ...
    
    # It could be also useful to provide method to get a specific field
    class MyCustomSchema2(marshmallow.Schema):
        name = p.get_marshmallow_field('name')
    

    @lafrech What do you think ?

    enhancement 
    opened by touilleMan 13
  • Allow exportation of the Marshmallow Schema of a Document

    Currently, the Schema of a Document can be obtained from Document.Schema. However, this Schema is made to [|de]serialize between "DB / data_proxy" and "OO world", not between "OO world" and "client/JSON world". (See diagram in the docs).

    In other words, uMongo is made to be used like this:

        document.dump(schema=schema)
        # which is equivalent to
        schema.dump(document._data._data)
        # but not equivalent to
        schema.dump(document)
    

    The difference being that the data_proxy does not behave like the document:

    • It may have keys named differently if "attribute" is used
    • document returns None if a key is missing
    • ...

    Therefore, using the Schema to serialize a Document may work but it currently has corner cases.

    @touilleMan confirms that the ability to export a Marshmallow Schema without the uMongo specificities is in the scope of uMongo and is a feature we want to have.

    The idea could be to add a method or attribute to the document to provide that "cleaned up" Schema.

    I'm opening this issue to centralize the reflections about that.

    Currently, here are the issues I found:

    • [ ] check_unknown_fields raises ValidationError if passed a dump_only field even if value is missing, which is an issue when validating a document before deserialization. (Bug report: https://github.com/Scille/umongo/issues/18, PR: https://github.com/Scille/umongo/pull/19)
    • [x] Some uMongo fields fail if value is None. Maybe this one is unrelated but just happened to occur while calling schema.dump(document) (PR: https://github.com/Scille/umongo/pull/32)
    • [ ] Fields with "attribute" not being None will fail because the Schema tries to find the value in "attribute", while in the document, it is available at the field's name. To avoid this, we could set "attribute" to None in all the fields. (PR: https://github.com/Scille/umongo/pull/33)

    https://github.com/Scille/umongo/pull/33 drafts a way of exporting the Marshmallow Schema from the document.

    enhancement 
    opened by lafrech 13
  • Is there support for OrderedDict

    If you like to find embedded documents the order of the kv pairs in the embedded document is essential as 'find' will only match documents with the same order. As the mongo docs say (https://docs.mongodb.com/manual/tutorial/query-documents/#exact-match-on-the-embedded-document):

    Equality matches on an embedded document require an exact match of the specified <value> document, including the field order.

    pymongo thus supports python OrderedDict: http://stackoverflow.com/a/30787769/4273834

    as does marshmallow: http://marshmallow.readthedocs.io/en/latest/quickstart.html#ordering-output

    Is there a way to use OrderedDict in umongo?
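
    To illustrate why the mapping type matters here, a small stdlib-only sketch: plain dict equality ignores key order, while OrderedDict equality does not, which mirrors the exact-match rule quoted above. No MongoDB or umongo calls are involved, and the field names are made up.

```python
from collections import OrderedDict

stored = OrderedDict([("street", "Main St"), ("city", "Paris")])
query_same_order = OrderedDict([("street", "Main St"), ("city", "Paris")])
query_swapped = OrderedDict([("city", "Paris"), ("street", "Main St")])

# OrderedDict comparison is order-sensitive, like an embedded-document match.
same_order_matches = stored == query_same_order
swapped_matches = stored == query_swapped

# Plain dict comparison ignores order, so it hides the problem.
plain_dict_matches = dict(stored) == dict(query_swapped)
```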

    enhancement 
    opened by pumelo 11
  • DateTime and timezone awareness

    I'm pulling my hair with datetime TZ awareness issues.

    I initiate the connection to MongoDB with tz_aware = False. I'm no expert about this, and I had never thought much about it before now, but from what I gathered, it seems like a reasonable choice to make. Besides, it's pymongo's default (but Flask-PyMongo hardcodes tz_aware to True).

    When a document is pulled from the DB, its DateTimeField attribute's TZ awareness depends only on MongoClient's tz_aware parameter (no Marshmallow schema involved):

    tz_aware=True -> pymongo provides an aware datetime -> umongo returns an aware datetime
    tz_aware=False -> pymongo provides a naive datetime -> umongo returns a naive datetime

    This is one more reason to set tz_aware = False, because I'm passing a document/object to a lib that expects a naive datetime. (The OAuth2 lib expects the expires timestamp to be naive and compares it to datetime.utcnow() (see code).) I suppose I could alternatively modify my getters to make all datetimes naive before returning tokens/grants, but in some use cases, it could get cumbersome.

    Marshmallow, however, returns every datetime as TZ aware (doc). For this reason, umongo's DateTimeField's _deserialize method returns a TZ aware datetime. Since I'm using it (via webargs) to parse the inputs to my API, I'm getting TZ aware datetimes. Likewise, calling load() on a document will use _deserialize and result in a TZ aware datetime.

    So if I load a date, it becomes TZ aware. Therefore, to compare it to a date from the database, this one needs to be aware as well.
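
    The mismatch can be reproduced with the standard library alone. In this sketch, `comparable` and `to_naive_utc` are hypothetical helpers, and the two datetimes stand in for a value loaded through a schema (aware) and one fetched with tz_aware=False (naive).

```python
import datetime as dt

aware = dt.datetime(2020, 1, 1, 12, 0, tzinfo=dt.timezone.utc)
naive = dt.datetime(2020, 1, 1, 12, 0)


def comparable(a, b):
    """Return True if a and b can be ordered without raising TypeError."""
    try:
        a < b
        return True
    except TypeError:
        return False


def to_naive_utc(value):
    """Convert an aware datetime to UTC and drop tzinfo; leave naive values as-is."""
    if value.tzinfo is None:
        return value
    return value.astimezone(dt.timezone.utc).replace(tzinfo=None)
```

    Ordering an aware datetime against a naive one raises TypeError, so one side has to be converted before comparing.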

    Should I understand that umongo is meant to be used with tz_aware=True, so that dates fetched from the database can be compared to dates loaded through Marshmallow schemas?

    Could there be a flag/meta allowing to specify if a DateTimeField should return a naive datetime?

    I made a quick and dirty patch to DateTime's _deserialize to remove the TZ from the returned output.

        dt = super()._deserialize(value, attr, data)
        return dt.replace(tzinfo=None)
    

    This seems to work on my use case. However, the day we complete pure Marshmallow schema export, the exported schema I pass to webargs for API input parsing won't have that feature. Unless this is added to Marshmallow as well. (I asked about it there: https://github.com/marshmallow-code/marshmallow/issues/520.)

    Feedback welcome. Those TZ issues are new to me, so I may be totally misguided.

    opened by lafrech 11
  • Warning: The 'missing' attribute of fields is deprecated

    When I run unit tests involving umongo, I get the following warning message (actually, if I run the tests via pytest instead of unittest, I get hundreds of the same warning):

    /Users/tiltowait/Library/Caches/pypoetry/virtualenvs/inconnu-qKm_l4La-py3.10/lib/python3.10/site-packages/umongo/data_proxy.py:160: RemovedInMarshmallow4Warning: The 'missing' attribute of fields is deprecated. Use 'load_default' instead.
       if callable(field.missing):
     /Users/tiltowait/Library/Caches/pypoetry/virtualenvs/inconnu-qKm_l4La-py3.10/lib/python3.10/site-packages/umongo/data_proxy.py:163: RemovedInMarshmallow4Warning: The 'missing' attribute of fields is deprecated. Use 'load_default' instead.
       self._data[mongo_name] = field.missing
     /Users/tiltowait/Library/Caches/pypoetry/virtualenvs/inconnu-qKm_l4La-py3.10/lib/python3.10/site-packages/umongo/data_proxy.py:161: RemovedInMarshmallow4Warning: The 'missing' attribute of fields is deprecated. Use 'load_default' instead.
       self._data[mongo_name] = field.missing()
    

    Is there any plan to address this? It looks like a simple fix, but I'm not familiar enough with umongo's code to be confident in that assessment.
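
    Until the library itself is updated, one possible stopgap is to filter this specific message with the standard `warnings` module. This does not fix the underlying use of `field.missing`; it only hides the noise. The helper name `silence_missing_deprecation` is made up for this sketch.

```python
import warnings

# Regex matched against the start of the warning message.
MISSING_DEPRECATION = r"The 'missing' attribute of fields is deprecated"


def silence_missing_deprecation():
    """Ignore only the 'missing' deprecation message; other warnings still show."""
    warnings.filterwarnings("ignore", message=MISSING_DEPRECATION)
```

    Test runners that manage warning filters themselves may need the equivalent filter declared in their own configuration instead.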

    opened by tiltowait 1
  • [RFC] Drop txmongo support?

    It seems txmongo is increasingly hard to support.

    txmongo cannot be imported with pymongo >= 4 (see https://github.com/twisted/txmongo/issues/278)

    Even if the import issues are fixed, txmongo internally uses an OP_QUERY command which is incompatible with Mongo server instances >= 6.0 (see https://www.mongodb.com/docs/manual/reference/mongodb-wire-protocol/#footnote-op-query-footnote).

    txmongo is not actively developed.

    Should txmongo support remain, or just be deprecated?

    opened by whophil 1
  • Referencing GridFS data?

    I would like to reference binary blobs larger than 16 MB in size in my uMongo documents.

    Seems like it should be possible to implement a sort of GridFS reference as a type of field. Has anybody tried this, or put any thought to how it might be done?

    Relates to https://github.com/Scille/umongo/issues/37

    opened by whophil 0
  • DictField with an EmbeddedDocument value produces error

    I am trying to set an EmbeddedDocument as a value in a DictField, which according to #99 should work. However, when I do so, I receive the following error:

    bson.errors.InvalidDocument: cannot encode object: <object EmbeddedDocument objects.items.MockItem({'uuid': UUID('cee1f505-1014-4bc8-9095-7047c869587c'), '_token_id': '1583', 'durability': 100})>, of type: <Implementation class 'objects.items.MockItem'>
    

    The DictField's definition:

        _slots = fields.DictField(attribute="slots")
    

    The MockItem EmbeddedDocument:

    @instance.register
    class MockItem(EmbeddedDocument):
        """An item with associated durability."""
    
        uuid = fields.UUIDField(default=uuid.uuid4)
        _token_id = fields.StrField(required=True, attribute="tokenId")
        durability = fields.IntField(default=100)
        _item = None
    
        # Custom __getattr__ and __hash__ omitted
    

    Stepping through the debugger, prior to attempting to commit(), the _slots dict has the following value:

    <object umongo.data_objects.Dict({'base': None, 'pet': None, 'cloak': None, 'oh': None, 'body': None, 'hair': None, 'ears': None, 'face': None, 'legs': None, 'feet': None, 'chest': None, 'hands': None, 'waist': None, 'head': None, 'mh': <object EmbeddedDocument objects.items.MockItem({'uuid': UUID('d28a07c1-49b3-4099-be44-4733d04420cd'), '_token_id': '-104', 'durability': 100})>})>
    

    In my AsyncioMotorClient, I am setting uuidRepresentation="standard", but I get the error with or without the UUID field present. What am I doing wrong?

    opened by tiltowait 0
  • ConstantField produces error when converting to marshmallow schema

    as_marshmallow_schema returned the following message for the ConstantField attribute:

    @instance.register
    class Dog(Document):
        breed = fields.ConstantField("Mongrel")
    
    DogMaSchema = Dog.schema.as_marshmallow_schema()
    
    Traceback (most recent call last):
      File "...\AppData\Local\Programs\Python\Python38\lib\code.py", line 90, in runcode
        exec(code, self.locals)
      File "<input>", line 5, in <module>
      File "...\.venv\lib\site-packages\umongo\abstract.py", line 66, in as_marshmallow_schema
        nmspc = {
      File "...\.venv\lib\site-packages\umongo\abstract.py", line 67, in <dictcomp>
        name: field.as_marshmallow_field()
      File "...\.venv\lib\site-packages\umongo\abstract.py", line 209, in as_marshmallow_field
        m_field = m_class(**field_kwargs, metadata=self.metadata)
    TypeError: __init__() missing 1 required positional argument: 'constant'
    
    
    opened by PGShifter 0
  • Insert many documents at once

    All the examples show how to add entries to the database one by one. But is it possible to use a method similar to insertMany to add many docs at once?

    opened by getjake 0
Owner
Scille