TinyDB is a lightweight document-oriented database optimized for your happiness :)

Introduction

TinyDB is a lightweight document-oriented database optimized for your happiness :) It's written in pure Python and has no external dependencies. Its target is small apps that would be blown away by a SQL database or an external database server.

TinyDB is:

  • tiny: The current source code has 1800 lines of code (with about 40% documentation) and 1600 lines of tests.
  • document-oriented: Like MongoDB, you can store any document (represented as a dict) in TinyDB.
  • optimized for your happiness: TinyDB is designed to be simple and fun to use by providing a simple and clean API.
  • written in pure Python: TinyDB requires neither an external server (as, e.g., PyMongo does) nor any dependencies from PyPI.
  • works on Python 3.5+ and PyPy: TinyDB works on all modern versions of Python and PyPy.
  • powerfully extensible: You can easily extend TinyDB by writing new storages or modify the behaviour of storages with middlewares.
  • 100% test coverage: No explanation needed.

To dive straight into all the details, head over to the TinyDB docs. You can also discuss everything related to TinyDB, such as general development and extensions, or showcase your TinyDB-based projects on the discussion forum.

Supported Python Versions

TinyDB has been tested with Python 3.5 - 3.8 and PyPy.

Example Code

>>> from tinydb import TinyDB, Query
>>> db = TinyDB('/path/to/db.json')
>>> db.insert({'int': 1, 'char': 'a'})
>>> db.insert({'int': 1, 'char': 'b'})

Query Language

>>> User = Query()
>>> # Search for a field value
>>> db.search(User.name == 'John')
[{'name': 'John', 'age': 22}, {'name': 'John', 'age': 37}]

>>> # Combine two queries with logical and
>>> db.search((User.name == 'John') & (User.age <= 30))
[{'name': 'John', 'age': 22}]

>>> # Combine two queries with logical or
>>> db.search((User.name == 'John') | (User.name == 'Bob'))
[{'name': 'John', 'age': 22}, {'name': 'John', 'age': 37}, {'name': 'Bob', 'age': 42}]

>>> # More possible comparisons:  !=  <  >  <=  >=
>>> # More possible checks: where(...).matches(regex), where(...).test(your_test_func)

Tables

>>> table = db.table('name')
>>> table.insert({'value': True})
>>> table.all()
[{'value': True}]

Using Middlewares

>>> from tinydb.storages import JSONStorage
>>> from tinydb.middlewares import CachingMiddleware
>>> db = TinyDB('/path/to/db.json', storage=CachingMiddleware(JSONStorage))

Contributing

Whether it's reporting bugs, discussing improvements and new ideas, or writing extensions: contributions to TinyDB are welcome! Here's how to get started:

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug
  2. Fork the repository on GitHub, create a new branch off the master branch and start making your changes (known as GitHub Flow)
  3. Write a test which shows that the bug was fixed or that the feature works as expected
  4. Send a pull request and bug the maintainer until it gets merged and published ☺

Issues
  • Logo for TinyDB?

    It would be nice to have a logo for TinyDB. @eugene-eeo I've seen that some of your projects have logos. Would you like to create one for TinyDB?

    opened by msiemens 34
  • Add Table.write_back(), replacing documents by ids

    Compared to update(), write_back() provides a more customizable workflow for modifying documents. One can first search for documents and modify them according to their conditions, then pass the new documents with their doc_ids to write_back().

    Example Usage

    from tinydb import TinyDB, Query
    from tinydb.storages import MemoryStorage
    
    db = TinyDB(storage=MemoryStorage)
    
    db.purge()
    
    dataset = [
        {"stock": ["cookies", "apple"]},
        {"stock": ["brownies"]},
        {"stock": ["cake"]},
        {"stock": ["cookies", "milk"]},
        {"stock": ["cake"]},
        {"stock": ["pie", "cake"]},
        {"stock": ["apple"]},
    ]
    
    table = db.table('yumyum')
    table.insert_multiple(dataset)
    # search
    yum = Query()
    docs = table.search(yum.stock.any(["cake", "apple"]))
    # modify docs
    for doc in docs:
        for i, stock in enumerate(doc["stock"]):
            if stock == "cake":
                doc["stock"][i] = "water"
            if stock == "apple":
                doc["stock"][i] = "honey"
    # write back
    table.write_back(docs)
    

    Or call write_back() with entirely new documents:

    db.purge()
    
    table = db.table('yumyum')
    table.insert_multiple(dataset)
    
    docs = table.all()
    # save doc_ids
    doc_ids = [doc.doc_id for doc in docs]
    # modify docs
    for i, doc in enumerate(docs):
        if len(doc["stock"]) == 1:
            docs[i] = {"best": doc["stock"][0]}
        elif len(doc["stock"]) > 1:
            docs[i] = {"good": doc["stock"]}
    
    table.write_back(docs, doc_ids)
    

    Could be a convenient feature :)

    opened by davidlatwe 21
  • adding new functionality to `update()`

    I'm looking to add new functionality to update. I would like to add an increment function first of all. I thought of something like this:

    db.update(inc('int'), cond, eids)
    

    And then change the update's present functionality to set, so:

    db.update(set({'char': 'a'}), cond, eids)
    

    Alternatively an interface such as this may be preferable:

    db.update(cond, eids).inc('int')
    db.update(cond, eids).set({'char': 'a'})
    

    This change would allow the addition of new functions such as max, min, etc going forward.

    I'm looking for thoughts/opinions on how best to structure this.

    opened by wearp 21
  • [FR] add encoding argument for JSONStorage

    Currently, JSONStorage uses open(..., encoding=None) to open the JSON file.

    But I would like to use utf-8 encoding.

    enhancement 
    opened by Cologler 18
  • Discussion: adding an index extension

    Hello, I am looking for some thoughts on a TinyDB extension that I am mulling over how to implement. Considering there is an active discussion about ideas for a 4.0 (#284), I thought this might be a good time to bring it up.

    To start off, here is a quick description of what I am doing in my application that I am interested in integrating directly into a TinyDB extension. Currently I have a DB whose size is on the upper end of what TinyDB is recommended for. In that DB, one particular field is used very often in DB "queries". The values of that field are not guaranteed to be unique, but the number of entries sharing the same value will always be small relative to the number of entries in the DB. Additionally, the values of that field are sortable. To speed up my application, I have been pulling the entries out of the DB and creating a SortedCollection sorted by that particular field, and I maintain the state of that collection alongside every database modification, somewhat like a smart cache. Then, when I get to a point where I would normally perform a query on that field (or a combination of that field and something else), I use the sorted collection for faster lookup.

    What I am trying to figure out is the best way to integrate this behavior into TinyDB so that it can be used transparently through the query interface. To start, I know I would need to create a Table class implementing the management and use of the sorted index(es). There must also be some way of specifying which field(s) should be indexed, either statically at the creation of the database object or dynamically based on the query being used. I am considering whether it makes sense to subclass the Query class so that the index is only used for a specific type of query.

    Finally, I need a way to determine, within each query-consuming call, whether the query can be performed on an index or must be run against the database directly. I imagine I would need to do something like decomposing the query "path" to determine whether it is a direct query on an indexed field, or a top-level AND of a query on that field and something else. I think those are the only cases that would work; a top-level OR definitely wouldn't.

    I am interested to hear any thoughts on the feasibility of this kind of extension, how I might go about implementing it, or anything I may have missed. Despite having gotten this far and having created a few simple customizations to TinyDB, I am still very new to this library and to Python in general. Any assistance would be greatly appreciated.
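
The side index described above can be sketched with the standard library's bisect module: keep parallel sorted lists of field values and doc_ids, and answer equality lookups with binary search (class and method names are illustrative, not TinyDB API):

```python
import bisect

class FieldIndex:
    """Sorted (value -> doc_ids) side index maintained next to a table."""

    def __init__(self):
        self._keys = []  # sorted field values
        self._ids = []   # doc_ids, kept parallel to _keys

    def add(self, value, doc_id):
        i = bisect.bisect_right(self._keys, value)
        self._keys.insert(i, value)
        self._ids.insert(i, doc_id)

    def remove(self, value, doc_id):
        i = bisect.bisect_left(self._keys, value)
        while i < len(self._keys) and self._keys[i] == value:
            if self._ids[i] == doc_id:
                del self._keys[i], self._ids[i]
                return
            i += 1

    def lookup(self, value):
        # all doc_ids whose field equals `value`, via two binary searches
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return self._ids[lo:hi]

index = FieldIndex()
index.add('cake', 1)
index.add('apple', 2)
index.add('cake', 3)
```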

    discussion stale 
    opened by dcflachs 15
  • __repr__ methods for classes TinyDB, Table and Query

    Hello, this addresses https://github.com/msiemens/tinydb/issues/226.

    I have added a basic __repr__ implementation for the mentioned classes, and would like to know whether you consider it sufficient and which other classes should be implemented in the same way.

    You may try the reprs with:

    from tinydb import database, queries
    
    print(repr(database.TinyDB('test.json')))
    print(repr(database.Table(database.JSONStorage('test.json'), 'some name')))
    
    Fruit = queries.Query()
    print(repr(Fruit.type == 'peach'))
    

    Achieving results:

    TinyDB(tables={'_default'}, tables_count=1, default_table_documents_count=0, all_tables_documents_count=['_default=0'])
    Table(name='some name', total=1, storage=<tinydb.storages.JSONStorage object at 0x10212c2e8>)
    QueryImpl('==', ('type',), 'peach')
    
    opened by Xarvalus 14
  • Storage interface to receive element in context

    What are your thoughts on passing the element from the write context (e.g. Table's insert) to the storage, instead of all the data? This would allow storages that don't have to rewrite all of the data every time one (or a few) elements are created, modified or deleted.

    BTW, you really nailed the API in this project. Nice and simple.

    opened by fictorial 14
  • Refactoring Elements

    I was writing a package to build a simple graph on top of TinyDB, and ran into a problem similar to #97. After reading the TinyDB source code, I found that it isn't really possible (or at least comes with a huge performance penalty).

    At the moment, whilst you can replace the Table class used by the TinyDB class, you have no control over the Element or StorageProxy classes. Instances of StorageProxy are created by the TinyDB class, and passed to the Table class. The StorageProxy then creates Element instances, with no way of changing it.

    If I were to extend Table and replace the _read method with my own in order to use my own objects, there would be quite a performance penalty. The reason being, JSONStorage by way of using the json package has created dict objects for every element, then StorageProxy has gone and replaced each of those dicts with an Element object, and then I would replace them again with my own. Additionally, whilst I could pass my object to the json module with the object_hook argument, this object would then be replaced by the StorageProxy anyway, defeating the purpose.

    The way I see it, the only purpose of the Element class is to give the element the eid attribute. I think this can be achieved in two ways:

    1. It would be possible to create a custom json decoder, based upon json/simplejson. I have looked over the source code for those packages, and it would be possible to maintain a 'breadcrumb' list as the elements were decoded, and thus the eid would be known as the element is created. In this instance, the element object to create could be defined by the object_hook argument. The problem with this approach is that you couldn't use a different json package, as they would not have the capability of determining the eid as they decoded. For this reason, I don't think this is a good option.
    2. Make the eid a field on the element. Then, any json package will work, and the object_hook argument that a lot of json packages seem to implement would work too.

    I think option 2 opens up a lot of possibilities. The 'default' Element used by TinyDB could just be the dict as produced by json, or a slightly extended version with __getattr__ that allows the eid to be returned as a property. An optional Element could allow all fields to be accessed as properties. And of course, people can pass in their own objects.

    There's a lot more specific implementation details I could go into, but I'll let you respond to this first.
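
Option 2 can be sketched with a plain dict subclass passed as json's object_hook (the _eid field name is illustrative, not TinyDB's):

```python
import json

class Element(dict):
    """Sketch of option 2: the eid lives in the data as a normal field
    and is exposed through a read-only property."""

    @property
    def eid(self):
        return self.get('_eid')

# object_hook turns every decoded JSON object into an Element
raw = '{"_eid": 1, "name": "John"}'
element = json.loads(raw, object_hook=Element)
```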

    opened by neRok00 14
  • Allow datetime objects in TinyDB

    Since JSON is the default serialization format for TinyDB, there's the datetime problem:

    >>> from tinydb import TinyDB
    >>> from datetime import datetime
    >>> db = TinyDB("db.json")
    >>> db.insert({"date": datetime.now()})
    ...
    TypeError: datetime.datetime(2015, 2, 21, 17, 24, 17, 828569) is not JSON serializable
    

    Many other databases handle datetime conversion for the user, and I would very much like TinyDB to do the same (It's usually fixed by specifying a custom encoding and a corresponding decoder when reading from the database).

    Do you think this is a good idea?

    opened by EmilStenstrom 14
  • [feature] Add timestamp on document insertion/update

    Hi Markus,

    My use-case for TinyDB requires a timestamp to be added to each document upon insertion and when that document is updated. This is useful for understanding database activities, establishing statistics and trends etc.

    I have implemented this feature, nonetheless I think this is a fairly simple addition and would like to discuss and see if other people are interested in this feature.

    A user-settable flag is defined in the Table class. The default value is false (i.e. the feature is not active) so as not to break compatibility.

    On insertion, the field 'created' is added to the document dictionary and initialized with the current UTC time in string format. The field 'updated' is also added but is initialized to the empty string ''.

    On update of a document, the field 'updated' is set to the current UTC time in string format.

    I don't really care what time zone/format is chosen as long as it is consistent, UTC seems like a good choice. This might be a point of issue for some people and maybe it will require some customization down the track, although I like the simplicity of having only one time format.

    Let me know what you think.

    Cheers,

    Miles

    opened by mcaples 13
  • Allow doc_id to be specified when using insert_multiple

    Currently doc_id cannot be used with insert_multiple because no check is performed for the Document class. This PR adds the same code as used in insert to insert_multiple, thus allowing a doc_id to be specified for any item being added.

    pinned 
    opened by waylonflinn 5
  • Restructure and extend documentation

    This is a note to myself to rework, restructure and extend the documentation based on the principles laid out in https://documentation.divio.com/introduction/ (and https://v3.vuejs.org/guide/contributing/writing-guide.html for that matter).

    I am considering adding a how-to section in the process. If anyone has ideas about what topics would be useful for short how-to guides/articles, feel free to comment on this issue 🙃

    enhancement pinned 
    opened by msiemens 4
  • Move a document from a certain database to another one error

    I'm trying to move a document to another database, but when I do this twice I get this error:

    AssertionError: doc_id 1 already exists
    

    This is probably because I insert the document into the second database and then remove it from the first database. As a result, the second document that I insert into database 2 also gets doc_id 1. Is there any workaround for this?

    My first python file:

    from tinydb import TinyDB, Query
    
    db = TinyDB('db1.json')
    db2 = TinyDB('db2.json')
    
    def moveDocumentToDb2(id):
        db1 = TinyDB('db1.json')
        db2 = TinyDB('db2.json')
    
        q = Query()
    
        document = db1.search(q.id == id)[0]
    
        db2.insert(document)
    
        db1.remove(q.id == id)
    
    # Inserted document to the first database and it gets the doc_id 1
    db.insert({'id': 1, 'type': 'apple', 'count': 7})
    
    # Move the 'apple' document to db2
    moveDocumentToDb2(1)
    

    My second python file:

    from tinydb import TinyDB, Query
    
    db = TinyDB('db1.json')
    db2 = TinyDB('db2.json')
    
    def moveDocumentToDb2(id):
        db1 = TinyDB('db1.json')
        db2 = TinyDB('db2.json')
    
        q = Query()
    
        document = db1.search(q.id == id)[0]
    
        db2.insert(document)
    
        db1.remove(q.id == id)
    
    # This document will also get the doc_id 1 (this is the issue)
    db.insert({'id': 2, 'type': 'peach', 'count': 3}) 
    
    # Now when I try to add the 'peach' document to db2 I get the error 
    # because they both have the same doc_id
    moveDocumentToDb2(2)
    
    pinned 
    opened by AlgoQ 2
  • Support document type

    #332

    pinned 
    opened by Cologler 1
  • feat: support `Document` type on update/remove etc

    Currently:

    table = ...
    items = table.all()
    ... # work with items, then:
    for item in items:
        table.update(item, doc_ids=[item.doc_id])
    

    Expected:

    table = ...
    items = table.all()
    ... # work with items, then:
    for item in items:
        table.update(item)
    
    discussion pinned 
    opened by Cologler 8
  • pypi tar-ball doesn't include tests

    Hi

    It would be great if the PyPI tarball could again include the tests. They are often used, for example, by Linux distributions, where having the tests and being able to run them after building the package is very valuable.

    thanks

    Arun

    pinned 
    opened by arunpersaud 1