A pythonic interface to Amazon's DynamoDB

Overview

PynamoDB

A Pythonic interface for Amazon's DynamoDB.

DynamoDB is a great NoSQL service provided by Amazon, but the API is verbose. PynamoDB presents you with a simple, elegant API.


Installation

From PyPI:

$ pip install pynamodb

From GitHub:

$ pip install git+https://github.com/pynamodb/PynamoDB#egg=pynamodb

From conda-forge:

$ conda install -c conda-forge pynamodb

Upgrading

Warning

The behavior of 'UnicodeSetAttribute' has changed in backwards-incompatible ways as of the 1.6.0 and 3.0.1 releases of PynamoDB.

The following steps can be used to safely update PynamoDB, assuming that the data stored in the item's UnicodeSetAttribute is not JSON. If JSON is being stored, these steps will not work and a custom migration plan is required. Be aware that values such as numeric strings (e.g. "123") are valid JSON.

When upgrading services that use PynamoDB with tables that contain UnicodeSetAttributes with a version < 1.6.0, first deploy version 1.5.4 to prepare the read path for the new serialization format.

Once all services that read from the tables have been deployed, then deploy version 2.2.0 and migrate your data using the provided convenience methods on the Model. (Note: these methods are only available in version 2.2.0)

def get_save_kwargs(item):
    # Return any conditional kwargs needed to ensure data does not get
    # overwritten -- for example, if your item has a `version` attribute:
    return {'version__eq': item.version}

# Re-serialize all UnicodeSetAttributes in the table by scanning all items.
# See documentation of fix_unicode_set_attributes for rate limiting options
# to avoid exceeding provisioned capacity.
Model.fix_unicode_set_attributes(get_save_kwargs)

# Verify the migration is complete
print("Migration Complete? " + str(Model.needs_unicode_set_fix()))

Once all data has been migrated then upgrade to a version >= 3.0.1.

Basic Usage

Create a model that describes your DynamoDB table.

from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute

class UserModel(Model):
    """
    A DynamoDB User
    """
    class Meta:
        table_name = "dynamodb-user"
    email = UnicodeAttribute(null=True)
    first_name = UnicodeAttribute(range_key=True)
    last_name = UnicodeAttribute(hash_key=True)

PynamoDB allows you to create the table if needed (it must exist before you can use it!):

UserModel.create_table(read_capacity_units=1, write_capacity_units=1)

Create a new user:

user = UserModel("Denver", "John")  # hash key (last_name) first, then range key (first_name)
user.email = "[email protected]"
user.save()

Now, search your table for all users with a last name of 'Denver' and whose first name begins with 'J':

for user in UserModel.query("Denver", UserModel.first_name.startswith("J")):
    print(user.first_name)

An example of querying your table with a filter condition (conditions on non-key attributes are passed via filter_condition):

for user in UserModel.query("Denver", filter_condition=UserModel.email == "[email protected]"):
    print(user.first_name)

Retrieve an existing user:

try:
    user = UserModel.get("Denver", "John")
    print(user)
except UserModel.DoesNotExist:
    print("User does not exist")

Advanced Usage

Want to use indexes? No problem:

from pynamodb.models import Model
from pynamodb.indexes import GlobalSecondaryIndex, AllProjection
from pynamodb.attributes import NumberAttribute, UnicodeAttribute

class ViewIndex(GlobalSecondaryIndex):
    class Meta:
        read_capacity_units = 2
        write_capacity_units = 1
        projection = AllProjection()
    view = NumberAttribute(default=0, hash_key=True)

class TestModel(Model):
    class Meta:
        table_name = "TestModel"
    forum = UnicodeAttribute(hash_key=True)
    thread = UnicodeAttribute(range_key=True)
    view = NumberAttribute(default=0)
    view_index = ViewIndex()

Now query the index for all items with 0 views:

for item in TestModel.view_index.query(0):
    print("Item queried from index: {0}".format(item))

It's really that simple.

Want to use DynamoDB local? Just add a host name attribute and specify your local server.

from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute

class UserModel(Model):
    """
    A DynamoDB User
    """
    class Meta:
        table_name = "dynamodb-user"
        host = "http://localhost:8000"
    email = UnicodeAttribute(null=True)
    first_name = UnicodeAttribute(range_key=True)
    last_name = UnicodeAttribute(hash_key=True)

Want to enable streams on a table? Just add a stream_view_type attribute to your model's Meta class and specify the type of data you'd like to stream.

from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute
from pynamodb.constants import STREAM_NEW_AND_OLD_IMAGE

class AnimalModel(Model):
    """
    A DynamoDB Animal
    """
    class Meta:
        table_name = "dynamodb-animal"
        host = "http://localhost:8000"
        stream_view_type = STREAM_NEW_AND_OLD_IMAGE
    type = UnicodeAttribute(null=True)
    name = UnicodeAttribute(range_key=True)
    id = UnicodeAttribute(hash_key=True)

Features

  • Python >= 3.6 support
  • An ORM-like interface with query and scan filters
  • Compatible with DynamoDB Local
  • Supports the entire DynamoDB API
  • Support for Unicode, Binary, JSON, Number, Set, and UTC Datetime attributes
  • Support for Global and Local Secondary Indexes
  • Automatically paginated iterators for query and scan operations
  • Complex queries
  • Batch operations with automatic pagination
Issues
  • Support for On Demand mode and Global Secondary Indexes

    The current code has incomplete support for the On Demand billing mode of DynamoDB.

    The notes around the launch state that the indexes inherit the On Demand mode from their tables.

    https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/

    Indexes created on a table using on-demand mode inherit the same scalability and billing model. You don’t need to specify throughput capacity settings for indexes, and you pay by their use. If you don’t have read/write traffic to a table using on-demand mode and its indexes, you only pay for the data storage.

    But the code currently throws an error AttributeError: type object 'Meta' has no attribute 'read_capacity_units' when trying to make a Global Secondary Index on a table using the On Demand mode.

    bug 
    opened by techdragon 25
  • Time To Live (TTL) support?

    Would be cool if we could set Time To Live (TTL) in the Meta options:

    https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html

    feature 
    opened by Bartvds 24
  • Add boolean attributes

    opened by jmphilli 24
  • Serialize UnicodeSetAttributes correctly

    When writing strings to db do not json encode them.

    If they were json encoded, decode them with json. Otherwise just return them.

    opened by jmphilli 23
  • Transaction support

    Heya @jlafon,

    Any plans to support transactions in future releases? https://aws.amazon.com/blogs/aws/dynamodb-transaction-library/

    feature 
    opened by sidtaduri 21
  • Multiple attrs update

    This allows updating multiple attributes at once via a new method, Model.update (code based on the existing Model.update_item()).

    This fixes #157 and #195

    opened by yedpodtrzitko 21
  • Given key conditions were not unique

    Hello all,

    I'm trying to update_or_create a document, which works most of the time.

    Table:

    class Report(BaseEntity):
        class Meta(BaseEntity.Meta):
            table_name = 'Reports'
    
        SomeId = UnicodeAttribute(hash_key=True)
        Time = NumberAttribute(range_key=True)  # Time as UTC Unix timestamp
        IncAttr = NumberAttribute()
    
    

    Here is the update command which sometimes failed: Report(SomeId="id", Time=get_current_time_report()).update(actions=[Report.IncAttr.add(1)])

    Somehow, in the database there are two documents with the same SomeId and Time

    Is it some case of a race condition? Anyone have any idea?

    Thanks.

    opened by amirvaza 19
  • _fast_parse_utc_date_string fails on pre year 1000

    Python datetime supports dates back to year 0, but _fast_parse_utc_date_string forces datetimes to have four digit years. This resulted in the following error:

    ValueError: Datetime string '539-02-20T08:36:49.000000+0000' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'

    _fast_parse_utc_date_string should fall back to a library call if it fails.

    Edit:

    My bad, looks like datetime.datetime.strptime('539-02-20T08:36:49.000000+0000', '%Y-%m-%dT%H:%M:%S.%f%z') also fails, meaning isoformat() doesn't produce parseable datetimes. :/

    bug 
    opened by l0b0 17
  • Using Retries using headers set in settings file rather than manual retries

    The default headers point to envoy. Removing the earlier retries at PynamoDB layer.

    The settings can be overridden by setting the environment variable. Default max retries is 3. Will retry on 5xx, and connect-failure.

    @danielhochman @mattklein123

    opened by anandswaminathan 16
  • [Question] Using GSIs with DynamoDB local

    There is a create_table() method in models that I'm using to create tables in DynamoDB Local when running my unit tests. Can someone explain how to deal with GSIs when running local tests? Does create_table() also create GSIs, or is there a separate API that I can't find? Do I need to use boto3 to create the GSI for the table instead?

    opened by SZubarev 1
  • [Question] How can I look up the name of the database table?

    Howdy. Sorry for the simple question, but I couldn't find it in the docs. How can I look up the name of the table the model has created? Is there a method attached to the model class where I can look up this sort of thing?

    Apologies if my searching skills are lacking. I did check the API docs, but came up empty. Thanks!

    opened by boldandbusted 1
  • TransactWrite.update does not hydrate in-memory models with updated table data

    Model.update passes ALL_NEW in the return values field of the request to DynamoDB, and then uses Model.deserialize to update the local instance to reflect the modifications. We don't, however, do this for transactions, which could be a bad user experience for people expecting the same behavior.

    opened by hallie 0
  • From Dict to Pynamodb Model

    Hello Guys,

    I have a dict with data and want to convert it to a PynamoDB model. How can I do it?

    Thank you

    opened by rubenssoto 0
  • Readme cleanup

    Removed duplicate example of querying a user with filter conditions.

    opened by ishan1608 0
  • Need type annotation for ListAttribute, MapAttribute

    I am using Python 3.9 type annotations with pynamodb installed, but I am getting errors when running mypy in the case of ListAttribute and MapAttribute. How can I fix this type error?

    Thanks

    opened by mhihasan 3
  • Single Table Design with PynamoDB

    Does this library support single table design for dynamodb? I've read a couple blog posts and now I'm intrigued about it.

    It's where you store items of different types in the same table, and use the hash key to differentiate them. For instance, in the blog post below, they have ski lift time series information with the instance identifier stored in the hash key and the various attributes defined in the sort key.

    It looks like they define getters to override what should be returned for a given attribute.

    Would it be possible to do something similar with pynamodb? What are some best practices if it's possible?

    • https://aws.amazon.com/blogs/database/amazon-dynamodb-single-table-design-using-dynamodbmapper-and-spring-boot/
    opened by HSchmale16 1
  • Getting a ValueError when deserializing a valid UTCTimeDateAttribute

    Hello,

    I'm getting stuck with a ValueError when I use the get() method on the following model:

    class Job(Model):
        class Meta:
            table_name = environ.get('TABLE_NAME')
            aws_access_key_id = environ.get('AWS_ACCESS_KEY_ID')
            aws_secret_access_key = environ.get('AWS_SECRET_ACCESS_KEY')
            region = 'eu-west-3'
    
        id = NumberAttribute(hash_key=True)
        name = UnicodeAttribute()
        tenant = UnicodeAttribute()
        query = UnicodeAttribute()
        schedule = UnicodeAttribute()
        active = BooleanAttribute()
        created_by = UnicodeAttribute()
        modified_by = UnicodeAttribute()
        creation_date = UTCDateTimeAttribute()
        modification_date = UTCDateTimeAttribute()
    

    the item I'm trying to retrieve has the following values:

    {
      "id": {
        "N": "1"
      },
      "tenant": {
        "S": "TRN"
      },
      "active": {
        "BOOL": true
      },
      "schedule": {
        "S": "every monday"
      },
      "query": {
        "S": "select * from *"
      },
      "created_by": {
        "S": "Lucas"
      },
      "modification_date": {
        "S": "2021-09-06 17:04:21.899683"
      },
      "modified_by": {
        "S": "Lucas"
      },
      "name": {
        "S": "Test Job 1"
      },
      "creation_date": {
        "S": "2021-09-06 17:04:21.899683"
      }
    }
    

    But i get the following stack trace:

     File "/mnt/c/Users/l.pierru/Documents/Focal Microservices/infor-data-lake-to-s3/test_scan.py", line 55, in <module>
        job = Job.get(2)
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 542, in get
        return cls.from_raw_data(item_data)
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/models.py", line 556, in from_raw_data
        return cls._instantiate(data)
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 400, in _instantiate
        AttributeContainer._container_deserialize(instance, attribute_values)
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 380, in _container_deserialize
        value = attr.deserialize(attr.get_value(attribute_value))
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 697, in deserialize
        return self._fast_parse_utc_date_string(value)
      File "/home/lpierru/.local/share/virtualenvs/infor-data-lake-to-s3-_H1zLWyn/lib/python3.9/site-packages/pynamodb/attributes.py", line 717, in _fast_parse_utc_date_string
        raise ValueError("Datetime string '{}' does not match format '{}'".format(date_string, DATETIME_FORMAT))
    ValueError: Datetime string '000002021-09-06 17:31:31.429277' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
    

    I already tried to create the item multiple times, checking for hidden characters and such. I don't know where the five zeroes before the year come from. I can't see those when using the AWS CLI or the website editor so I think it's coming from the pynamodb module.

    Any help on this would be appreciated.

    opened by mr-propre 3
  • Inconsistent behavior from UTCDateTimeAttribute when accessing via cls and self

    Hi there, I have a couple models that utilize the UTCDateTimeAttribute attribute. My models have a couple class methods to run common scan/query operations as well as some properties for additional attributes that can be calculated from the persisted data.

    I noticed today that when accessing a UTCDateTimeAttribute via cls in a class method, I can use the published methods of UTCDateTimeAttribute for scan operations as well as compare the value to a timezone naive datetime. Here I have a model called Experiment that has a start_date and end_date. The get_active_experiments class method should (and does) return all experiments where today's date is between the start and end date (or after the start date if end date does not exist):

    class Experiment(Model):
        experiment_id = UnicodeAttribute(hash_key=True)
        start_date = UTCDateTimeAttribute()
        end_date = UTCDateTimeAttribute(null=True)
    
        @classmethod
        def get_active_experiments(cls):
            return cls.scan(
                (cls.start_date <= dt.datetime.now()) &
                ((cls.end_date.does_not_exist()) |
                 (cls.end_date >= dt.datetime.now()))
            )
    

    If I execute Experiment.get_active_experiments() then there are no exceptions thrown and the correct result set is given (which is none in my current database), even though dt.datetime.now() is timezone-naive. Today, I added a new property so that I can get the status of experiments that have already been retrieved:

    class Experiment(Model):
        experiment_id = UnicodeAttribute(hash_key=True)
        start_date = UTCDateTimeAttribute()
        end_date = UTCDateTimeAttribute(null=True)
    
        @classmethod
        def get_active_experiments(cls):
    
            return cls.scan(
                (cls.start_date <= dt.datetime.now()) &
                ((cls.end_date.does_not_exist()) |
                 (cls.end_date >= dt.datetime.now()))
            )
    
        @property
        def status(self):
    
            """ Gets the current status of the experiment (active, upcoming, closed) """
    
            now = dt.datetime.now()
            start_date_in_past = self.start_date <= now
            end_date_in_future = (self.end_date.does_not_exist()) | (self.end_date >= now)
    
            if start_date_in_past and end_date_in_future:
                return 'active'
            elif start_date_in_past and not end_date_in_future:
                return 'closed'
            elif not start_date_in_past and end_date_in_future:
                return 'upcoming'
    
            return 'N/A'
    

    If I get an experiment and try to access .status, first an exception was thrown about comparing a timezone-aware datetime with a timezone-naive datetime:

    >>>e = Experiment.get('<some id>')
    >>>e.status
    
    TypeError: can't compare offset-naive and offset-aware datetimes
    

    So I modified my property to make now timezone aware:

    class Experiment(Model):
        experiment_id = UnicodeAttribute(hash_key=True)
        start_date = UTCDateTimeAttribute()
        end_date = UTCDateTimeAttribute(null=True)
    
        @classmethod
        def get_active_experiments(cls):
    
            return cls.scan(
                (cls.start_date <= dt.datetime.now()) &
                ((cls.end_date.does_not_exist()) |
                 (cls.end_date >= dt.datetime.now()))
            )
    
        @property
        def status(self):
    
            """ Gets the current status of the experiment (active, upcoming, closed) """
    
            now = pytz.utc.localize(dt.datetime.utcnow())
            start_date_in_past = self.start_date <= now
            end_date_in_future = (self.end_date.does_not_exist()) | (self.end_date >= now)
    
            if start_date_in_past and end_date_in_future:
                return 'active'
            elif start_date_in_past and not end_date_in_future:
                return 'closed'
            elif not start_date_in_past and end_date_in_future:
                return 'upcoming'
    
            return 'N/A'
    

    This worked to eliminate the timezone exception, but then a new exception was thrown for the does_not_exist() call:

    AttributeError: 'datetime.datetime' object has no attribute 'does_not_exist'
    

    So it seems that when you access the UTCDateTimeAttribute via cls, you don't have to consider timezones in comparisons and can use methods such as does_not_exist(), because you're accessing the attribute on the class; but if you access it via self, you're accessing the deserialized datetime object.

    I'm not sure if this is expected or not, but it felt unexpected and inconsistent to me. I have a clear workaround but wanted to raise the issue in case it's a bug.

    opened by JaredStufftGD 3
  • Dynamic attributes like DynamicMapAttributes and increment/decrement operations.

    apologies guys for creating an issue for this.

    Is it possible to create and use dynamic attributes like DynamicMapAttribute? What I am trying to achieve is to create and set/update/increment/decrement dynamic attributes and their values.

    for example,

    class Student(Model):
      id = NumberAttribute(null=False)
    
    subject_name = "English"
    subject_score = 0
    std = Student(id = std_id)
    std.<subject_name>= subject_score # since subjects and scores are dynamic.
    

    and then at a later point of time inside a transaction,

    Student.<subject_name>.add(10)

    I can getaway with putting Sub and Score in a DynamicMapAttribute, but when it comes to incrementing/decrementing it, it seems DynamoDB don't allow doing that for a map element.

    I tried creating a DynamicMapAttribute instance as shown below, but still no idea how can I do actions like increment/decrement on the attribute.

    class Student(DynamicMapAttribute):
      id = NumberAttribute(null=False)
    
    subject_name = "ABC"
    subject_init_score = 0
    
    std = Student(id = 101)
    std.__setattr__(subject_name, subject_reset_score)
    

    but then how do I increment/decrement the value without retrieving it and adding it back (which in turn defeats the purpose of ADD)?

    Any idea?

    PS: I cannot do something like this in here, where a_date_time/a_number is static. https://github.com/pynamodb/PynamoDB/blob/58ff97f42f3ed1e546335e6b86f943ca3c23f7f3/pynamodb/attributes.py#L1000

        >>> class MyDynamicMapAttribute(DynamicMapAttribute):
        >>>     a_date_time = UTCDateTimeAttribute()  # raw map attributes cannot serialize/deserialize datetime values
        >>>
        >>> dynamic_map = MyDynamicMapAttribute()
        >>> dynamic_map.a_date_time = datetime.utcnow()
        >>> dynamic_map.a_number = 5
    
    opened by aczire 0