Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

Overview

dataset: databases for lazy people


In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files.

Read the docs

To install dataset, fetch it with pip:

$ pip install dataset

Note: as of version 1.0, dataset is split into two packages, with the data export features extracted into a stand-alone package, datafreeze. See the datafreeze repository.

Issues
  • Enhancement

    This PR adds the following:

    • Table.delete now returns True or False, so callers can detect an attempt to delete a non-existent row or column.
    • Add Table.__repr__.
    enhancement
    opened by cli248 13
  • sqlite3 OperationalError

    Hello there,

    I always get the following error

    Traceback (most recent call last):
      File "./compile_results.py", line 58, in <module>
        for i in table_send:
      File "/usr/local/lib/python2.7/dist-packages/dataset/persistence/util.py", line 73, in __next__
        return self.next()
      File "/usr/local/lib/python2.7/dist-packages/dataset/persistence/util.py", line 67, in next
        if not self._next_chunk():
      File "/usr/local/lib/python2.7/dist-packages/dataset/persistence/util.py", line 58, in _next_chunk
        chunk = self.result_proxy.fetchmany(self.step)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 997, in fetchmany
        self.cursor, self.context)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
        exc_info
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 990, in fetchmany
        l = self.process_rows(self._fetchmany_impl(size))
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 907, in _fetchmany_impl
        return self.cursor.fetchmany(size)
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) SQL logic error or missing database

    when the loop reaches entry number 5000 of my sqlite database.

    I import two databases into memory with about 60000 rows, which works. I can also access row 5000 (here id 10000) directly without error. However, as soon as the for loop reaches this position, the whole Python program crashes with the error shown above. So the database seems to be working and to have imported the data correctly (otherwise I could not access it before the loop), but something strange happens during iteration. Any ideas?

    Thank you very much for your help and great product :)

    # import json from sender into memory database
    db_send = dataset.connect('sqlite:///:memory:')
    table_send = db_send['record']
    with open(sendfile) as send:
        s = json.load(send)
    table_send.insert_many(s["results"])
    send.close()
    
    # import json from receiver into memory database
    db_recv = dataset.connect('sqlite:///:memory:')
    table_recv = db_recv['record']
    with open(recvfile) as recv:
        r = json.load(recv)
    table_recv.insert_many(r["results"])
    recv.close()
    
    # init variables for matching counter
    count_search=0
    count_found=0
    
    #print len(table_send)
    #test = table_send.find_one(id=9998)
    #print test['id']
    #test = table_send.find_one(id=10000)
    #print test['id']
    #test = table_send.find_one(id=10002)
    #print test['id']
    
    # iterate through sender results, set file as sent
    # match checksum of sent files to received files
    # if matches, include time details of receiver and set received
    # update results
    for i in table_send:
            print i['id']
            i['file_sent'] = 1
            count_search += 1
            res_recv = table_recv.find_one(filechecksum=i['filechecksum'])
            i['file_sent'] = 1
            if res_recv != None:
                    count_found += 1
                    #print i['filechecksum'] + ':' + res_recv['filechecksum']
                    i['r_time_start'] = res_recv['time_start']
                    i['r_time_end'] = res_recv['time_end']
                    i['r_time'] = res_recv['time']
                    i['file_received'] = 1
            table_send.update(i,['id'])
    
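    For what it's worth, the crash pattern (reads failing mid-iteration after a series of updates) is consistent with updating the table while the SELECT cursor driving the for loop is still open. A workaround that avoids the problem is to materialize the rows before updating; here is a minimal sketch of the pattern using the standard sqlite3 module (table and column names are illustrative, not taken from dataset's API):

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE record (id INTEGER PRIMARY KEY, file_sent INTEGER)')
db.executemany('INSERT INTO record (file_sent) VALUES (?)', [(0,)] * 5)

# Materialize all rows first, so no read cursor is open while updating.
rows = list(db.execute('SELECT id FROM record'))
for (row_id,) in rows:
    db.execute('UPDATE record SET file_sent = 1 WHERE id = ?', (row_id,))
db.commit()
```

    In the issue's code, the equivalent change would be iterating over `list(table_send)` instead of `table_send` directly.
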
    opened by nmaas87 12
  • How do I silence the "DEBUG:dataset.persistence…" messages?

    I'm getting logging messages appearing in my script output. How do I silence them? And why are they even there in the first place?

    >>> import dataset
    >>> dt = dataset.connect('sqlite:///:memory:')
    >>> table = dt['test']
    DEBUG:dataset.persistence.database:Creating table: test on Engine(sqlite:///:memory:)
    >>> table.insert({"id": "bv56fzi", "name": "This will fail"})
    DEBUG:dataset.persistence.table:Creating column: name (<class 'sqlalchemy.types.UnicodeText'>) on 'test'
    
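    For reference, these messages come from Python's standard logging module; they can be silenced by raising the threshold on the `dataset` logger hierarchy (a minimal sketch):

```python
import logging

# Raise the threshold for everything under the 'dataset' logger,
# which covers dataset.persistence.database and dataset.persistence.table.
logging.getLogger('dataset').setLevel(logging.WARNING)
```
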
    opened by zarino 12
  • Implement handling of iterator rows

    Updated the insert_many and update_many functions so that iterator inputs are handled properly, as mentioned by @pudo in my last pull request (#298). It may be a bit over-complicated, so it may require modification.

    opened by abmyii 10
  • Support for storing Numpy arrays as LargeBinary objects

    It is useful to be able to store Numpy arrays as large binary objects in a database. Here is an implementation that uses Numpy's native binary format to convert an array to a binary string and store it in the database. When reading entries from the database, a function checks whether a binary string starts with \x93NUMPY (the first 6 bytes of any serialized numpy array) and converts it back to a numpy array.

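    The approach described above can be sketched as follows; the helper names here are illustrative, not the PR's actual API:

```python
import io
import numpy as np

def array_to_blob(arr):
    # Serialize using NumPy's native .npy format
    buf = io.BytesIO()
    np.save(buf, arr)
    return buf.getvalue()

def blob_to_value(blob):
    # .npy payloads begin with the magic bytes b'\x93NUMPY'
    if isinstance(blob, bytes) and blob.startswith(b'\x93NUMPY'):
        return np.load(io.BytesIO(blob))
    return blob

a = np.arange(6).reshape(2, 3)
blob = array_to_blob(a)       # bytes suitable for a LargeBinary column
restored = blob_to_value(blob)
```
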
    opened by davidkleiven 10
  • Close connection automatically when with is used

    If I connect to a database using the with statement like this:

    with dataset.connect(path_to_database) as db:
        pass
    

    Does the connection to database close automatically once we are outside the scope of the with statement? Or should we still explicitly close the connection?

    opened by ritviksahajpal 10
  • connecting times out when connecting to a db with an "@" in the password.

    This might be a sqlalchemy thing, but when I try to connect to a postgres db with the following (example) url:

    import dataset
    db_url = "postgres://drsuess:[email protected]@123.435.123.1:5432/lorax"
    db = dataset.connect(db_url)
    

    The connection attempt just freezes indefinitely. I tried changing the password and it remedied the situation, but you might want to add a check / warning for this.

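    As a workaround, special characters in the password can be percent-encoded before building the URL, so only the host-separator "@" remains. The credentials and host below are made up:

```python
from urllib.parse import quote_plus

password = 'p@ssw0rd'
# '@' in the password becomes '%40'
db_url = 'postgres://drsuess:%s@db.example.com:5432/lorax' % quote_plus(password)
```
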
    opened by abelsonlive 9
  • connection schema handling on inserts

    On my postgresql database (9.3.2), executing the quickstart examples works when connection schema is not specified:

    >>> import dataset
    >>> db = dataset.connect()
    >>> table = db['person']
    >>> print(table.columns)
    ['id']
    >>> table.insert(dict(name='John Doe', age=46))
    1
    

    However, when a connection schema is specified, the created table is not found on insert - it seems to be looked up in the current search_path instead:

    >>> import dataset
    >>> db = dataset.connect(schema="myschema")
    >>> table = db['person']
    >>> print(table.columns)
    ['id']
    >>> table.insert(dict(name='John Doe', age=46))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dataset/persistence/table.py", line 66, in insert
        self._ensure_columns(row, types=types)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dataset/persistence/table.py", line 198, in _ensure_columns
        self.create_column(column, _type)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dataset/persistence/table.py", line 224, in create_column
        Column(name, type)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/alembic/operations.py", line 365, in add_column
        schema=schema
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/alembic/ddl/impl.py", line 127, in add_column
        self._exec(base.AddColumn(table_name, column, schema=schema))
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/alembic/ddl/impl.py", line 76, in _exec
        conn.execute(construct, *multiparams, **params)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1651, in execute
        return connection.execute(statement, *multiparams, **params)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 717, in execute
        return meth(self, multiparams, params)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/sql/ddl.py", line 67, in _execute_on_connection
        return connection._execute_ddl(self, multiparams, params)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 771, in _execute_ddl
        compiled
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 927, in _execute_context
        context)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1076, in _handle_dbapi_exception
        exc_info
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 185, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 920, in _execute_context
        context)
      File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 425, in do_execute
        cursor.execute(statement, parameters)
    sqlalchemy.exc.ProgrammingError: (ProgrammingError) relation "person" does not exist
     'ALTER TABLE person ADD COLUMN age INTEGER' {}
    
    opened by smnorris 9
  • Unable to Pip Install Dataset

    After a clean install of OSX Mojave 10.14.2, I can no longer pip install dataset into an Anaconda Python 3.6.5 environment. The error below lists GCC as a problem during the build. I attempted to reinstall GCC using Homebrew, to no avail. I then wiped and reinstalled Mojave, with the same results.

    I also attempted to pip install dataset into the root system environment, with the same results.

    I can install other Python modules into both the root system and Anaconda environments.

    Any direction on getting dataset working again would be greatly appreciated. Errors below:

    Collecting dataset
      Using cached https://files.pythonhosted.org/packages/ba/76/5b8fe80f487bfa90b8d0094c3a27150fe797b1fbe02dec31b048fd52448d/dataset-1.1.0-py2.py3-none-any.whl
    Collecting sqlalchemy>=1.1.0 (from dataset)
    Requirement already satisfied: six>=1.11.0 in /Users/jz/anaconda3/envs/bridge/lib/python3.6/site-packages (from dataset) (1.12.0)
    Collecting normality>=0.5.1 (from dataset)
      Using cached https://files.pythonhosted.org/packages/7f/38/9ed937afe0fe95a281180d8820d46eb955bc1b0f656f3b301635608a5c1f/normality-1.0.0-py2.py3-none-any.whl
    Collecting alembic>=0.6.2 (from dataset)
    Collecting banal>=0.4.1 (from normality>=0.5.1->dataset)
      Using cached https://files.pythonhosted.org/packages/27/07/376d107ccf0dd80efdd8e4310b1d7d34a2d54f5389f946d58950c6dd70cc/banal-0.4.2-py2.py3-none-any.whl
    Collecting chardet (from normality>=0.5.1->dataset)
      Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
    Collecting pyicu>=1.9.3 (from normality>=0.5.1->dataset)
      Using cached https://files.pythonhosted.org/packages/c2/15/0af20b540c828943b6ffea5677c86e908dcac108813b522adebb75c827c1/PyICU-2.2.tar.gz
    Collecting Mako (from alembic>=0.6.2->dataset)
    Requirement already satisfied: python-dateutil in /Users/jz/anaconda3/envs/bridge/lib/python3.6/site-packages (from alembic>=0.6.2->dataset) (2.7.5)
    Collecting python-editor>=0.3 (from alembic>=0.6.2->dataset)
    Collecting MarkupSafe>=0.9.2 (from Mako->alembic>=0.6.2->dataset)
      Using cached https://files.pythonhosted.org/packages/da/fc/2979c425ad23d528d6ac2e1f3efdc28e572fa1e1fbd5a75171cbdd7ddaa5/MarkupSafe-1.1.0-cp36-cp36m-macosx_10_6_intel.whl
    Building wheels for collected packages: pyicu
      Running setup.py bdist_wheel for pyicu: started
      Running setup.py bdist_wheel for pyicu: finished with status 'error'
      Complete output from command /Users/jz/anaconda3/envs/bridge/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pycharm-packaging/pyicu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pip-wheel-q_nht7jo --python-tag cp36:

    Building PyICU 2.2 for ICU 58.2

    Adding CXXFLAGS="-I/Users/jz/anaconda3/include" from /Users/jz/anaconda3/bin/icu-config
    Adding LDFLAGS="-Wl,-rpath,/Users/jz/anaconda3/lib -L/Users/jz/anaconda3/lib -licui18n -licuuc -licudata" from /Users/jz/anaconda3/bin/icu-config
    running bdist_wheel
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.7-x86_64-3.6
    copying PyICU.py -> build/lib.macosx-10.7-x86_64-3.6
    creating build/lib.macosx-10.7-x86_64-3.6/icu
    copying icu/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/icu
    warning: build_py: byte-compiling is disabled, skipping.

    running build_ext
    building '_icu' extension
    creating build/temp.macosx-10.7-x86_64-3.6
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/jz/anaconda3/envs/bridge/include -arch x86_64 -I/Users/jz/anaconda3/envs/bridge/include -arch x86_64 -I/Users/jz/anaconda3/envs/bridge/include/python3.6m -c _icu.cpp -o build/temp.macosx-10.7-x86_64-3.6/_icu.o -DPYICU_VER="2.2" -I/Users/jz/anaconda3/include
    warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
    In file included from _icu.cpp:27:
    ./common.h:38:13: error: unknown type name 'decltype'
        typedef decltype(nullptr) nullptr_t;
                ^
    ./common.h:38:30: error: expected ';' after top level declarator
        typedef decltype(nullptr) nullptr_t;
                                 ^
                                 ;
    In file included from _icu.cpp:27:
    In file included from ./common.h:106:
    In file included from /Users/jz/anaconda3/include/unicode/unistr.h:33:
    /Users/jz/anaconda3/include/unicode/std_string.h:35:10: fatal error: 'string' file not found
    #include <string>
             ^~~~~~~~
    1 warning and 3 errors generated.
    error: command 'gcc' failed with exit status 1


    Running setup.py clean for pyicu
    Failed to build pyicu
    Installing collected packages: sqlalchemy, banal, chardet, pyicu, normality, MarkupSafe, Mako, python-editor, alembic, dataset
      Running setup.py install for pyicu: started
      Running setup.py install for pyicu: finished with status 'error'
      Complete output from command /Users/jz/anaconda3/envs/bridge/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pycharm-packaging/pyicu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pip-record-yz3yw6d5/install-record.txt --single-version-externally-managed --compile:

    Building PyICU 2.2 for ICU 58.2
    
    Adding CXXFLAGS="-I/Users/jz/anaconda3/include" from /Users/jz/anaconda3/bin/icu-config
    Adding LDFLAGS="-Wl,-rpath,/Users/jz/anaconda3/lib -L/Users/jz/anaconda3/lib -licui18n -licuuc -licudata" from /Users/jz/anaconda3/bin/icu-config
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.7-x86_64-3.6
    copying PyICU.py -> build/lib.macosx-10.7-x86_64-3.6
    creating build/lib.macosx-10.7-x86_64-3.6/icu
    copying icu/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/icu
    warning: build_py: byte-compiling is disabled, skipping.
    
    running build_ext
    building '_icu' extension
    creating build/temp.macosx-10.7-x86_64-3.6
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/jz/anaconda3/envs/bridge/include -arch x86_64 -I/Users/jz/anaconda3/envs/bridge/include -arch x86_64 -I/Users/jz/anaconda3/envs/bridge/include/python3.6m -c _icu.cpp -o build/temp.macosx-10.7-x86_64-3.6/_icu.o -DPYICU_VER="2.2" -I/Users/jz/anaconda3/include
    warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
    In file included from _icu.cpp:27:
    ./common.h:38:13: error: unknown type name 'decltype'
        typedef decltype(nullptr) nullptr_t;
                ^
    ./common.h:38:30: error: expected ';' after top level declarator
        typedef decltype(nullptr) nullptr_t;
                                 ^
                                 ;
    In file included from _icu.cpp:27:
    In file included from ./common.h:106:
    In file included from /Users/jz/anaconda3/include/unicode/unistr.h:33:
    /Users/jz/anaconda3/include/unicode/std_string.h:35:10: fatal error: 'string' file not found
    #include <string>
             ^~~~~~~~
    1 warning and 3 errors generated.
    error: command 'gcc' failed with exit status 1
    
    ----------------------------------------
    

    Failed building wheel for pyicu
    Command "/Users/jz/anaconda3/envs/bridge/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pycharm-packaging/pyicu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pip-record-yz3yw6d5/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/8v/wnw63nls49572vj7073gzn4w0000gn/T/pycharm-packaging/pyicu/

    opened by liquidgenius 8
  • Fix #90: added `with` statement support to the Database class.

    This PR adds support for the with statement to the Database class. Basic usage:

    with dataset.connect('...') as db:
        db['...'].insert()
    

    If an exception is raised within the with block, all changes are rolled back and the exception is re-raised. Otherwise, all changes are committed upon leaving the block.

    Tests are added as well.

    This PR partially fixes the issue https://github.com/pudo/dataset/issues/90.

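    The commit/rollback semantics described above can be sketched with a bare context manager (this is an illustration of the protocol, not dataset's actual implementation):

```python
class Database:
    """Sketch: commit on clean exit, roll back on error."""
    def __init__(self):
        self.committed = False
        self.rolled_back = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.commit()    # clean exit: persist changes
        else:
            self.rollback()  # exception: undo changes
        return False         # propagate the exception

with Database() as db:
    pass                     # db.committed is now True

failing = Database()
try:
    with failing:
        raise ValueError('boom')
except ValueError:
    pass                     # failing.rolled_back is now True
```
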
    opened by victorkashirin 8
  • Doesn't roundtrip datetimes - at least in SQLite.

    The actual problem underlying #96.

    If you put a datetime into the database, you get a unicode string back. If you put the unicode string into the database, #96 stings you and it crashes.

    (Roundtripping is important for "fetch this row from the DB, mutate it and save it back" scenarios.)

    >>> import dataset
    >>> db = dataset.connect("sqlite:///:memory:")
    >>> table = db['table']
    >>> import datetime
    >>> table.insert({'datecol': datetime.datetime.now()})
    1
    >>> for row in table: pass
    ...
    >>> row
    OrderedDict([(u'id', 1), (u'datecol', u'2014-06-04 12:05:32.821552')])
    >>> type(row['datecol'])
    <type 'unicode'>
    >>> table.insert({'datecol': row['datecol']})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/venv/local/lib/python2.7/site-packages/dataset/persistence/table.py", line 67, in insert
        res = self.database.executable.execute(self.table.insert(row))
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1682, in execute
        return connection.execute(statement, *multiparams, **params)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 720, in execute
        return meth(self, multiparams, params)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 317, in _execute_on_connection
        return connection._execute_clauseelement(self, multiparams, params)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 817, in _execute_clauseelement
        compiled_sql, distilled_params
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 884, in _execute_context
        None, None)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1108, in _handle_dbapi_exception
        exc_info
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 185, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 880, in _execute_context
        context = constructor(dialect, self, conn, *args)
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 572, in _init_compiled
        param.append(processors[key](compiled_params[key]))
      File "/home/venv/local/lib/python2.7/site-packages/sqlalchemy/sql/type_api.py", line 864, in process
        return process_param(value, dialect)
      File "/home/venv/local/lib/python2.7/site-packages/dataset/persistence/util.py", line 87, in process_bind_param
        return (value / 1000 - self.epoch).total_seconds()
    sqlalchemy.exc.StatementError: unsupported operand type(s) for /: 'unicode' and 'int' (original cause: TypeError: unsupported operand type(s) for /: 'unicode' and 'int') u'INSERT INTO "table" (datecol) VALUES (?)' []
    
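    Until roundtripping is fixed, one workaround is to parse the returned string back into a datetime before reinserting; the format string below matches the rendering shown in the session above:

```python
import datetime

s = '2014-06-04 12:05:32.821552'  # what the table hands back
dt = datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S.%f')
```
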
    opened by scraperdragon 8
  • enable multiple AND filters on the same column

    This allows specifying multiple filters for the same column.

    General syntax

    Updates table.find(*_clauses, **kwargs) such that kwargs can be specified as column__<op>=value.

    Example

    # height >= 20 AND height <= 50
    table.find(height__gte=20, height__lte=50)
    

    Relates to

    #213

    opened by miraculixx 0
  • Issue when creating id field

    I am trying to copy an MSSQL table into a Postgres table. When I try to insert the data, it raises an error: dataset converts the uniqueidentifier field to serial4.

    How do I manually set a column type?

    table.upsert_many(rows = list(data), keys = data.keys, chunk_size=1000)

    (psycopg2.errors.InvalidTextRepresentation) invalid input syntax for type integer: "CDEA5888-293A-4D0B-B2DE-0001C633F4FE"

    opened by ahvahsky2008 0
  • proposing dataset-orm as an add-on

    I needed an ORM-like front-end to dataset, so I made one. dataset-orm also includes a files column type and an API that supports storing files of arbitrary size using simple put()/get() semantics.

    If you like it, I'm happy to merge it; perhaps it's better to keep it as a separate add-on.

    PyPI: https://pypi.org/project/dataset-orm
    GitHub: https://github.com/productaize/dataset-orm

    #308

    opened by miraculixx 0
  • fix: Handle list type as JSON.

    Before this change, a list value would be inferred as text type, and inserting the data would then produce the following error:

    sqlite3.InterfaceError: Error binding parameter 4 - probably unsupported type.

    Handling lists as JSON type resolves this error.

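    On versions without this fix, a workaround is to serialize lists to JSON strings yourself before inserting, and decode them on read (a sketch with made-up row data):

```python
import json

row = {'name': 'example', 'tags': ['a', 'b']}
row['tags'] = json.dumps(row['tags'])  # store as TEXT: '["a", "b"]'
# ...insert row into the table here...
tags = json.loads(row['tags'])         # decode when reading back
```
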
    opened by feanil 0
  • MySQL and Python 3.8

    I am trying to use dataset with MySQL 8.

    I noticed MySQLdb is not really supported on Python 3, so I found mysqlclient, which replaces MySQLdb for Python 3.

    When I run the following:

    import dataset
    db = dataset.connect('mysql://un:[email protected]')
    db.tables()
    

    I get:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/dataset/venv/lib/python3.8/site-packages/dataset/database.py", line 184, in tables
        return self.inspect.get_table_names(schema=self.schema)
      File "/home/dataset/venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 266, in get_table_names
        return self.dialect.get_table_names(
      File "<string>", line 2, in get_table_names
      File "/home/dataset/venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 55, in cache
        ret = fn(self, con, *args, **kw)
      File "/home/dataset/venv/lib/python3.8/site-packages/sqlalchemy/dialects/mysql/base.py", line 2786, in get_table_names
        % self.identifier_preparer.quote_identifier(current_schema)
      File "/home/dataset/venv/lib/python3.8/site-packages/sqlalchemy/sql/compiler.py", line 5120, in quote_identifier
        + self._escape_identifier(value)
      File "/home/dataset/venv/lib/python3.8/site-packages/sqlalchemy/sql/compiler.py", line 5079, in _escape_identifier
        value = value.replace(self.escape_quote, self.escape_to_quote)
    AttributeError: 'NoneType' object has no attribute 'replace'
    

    I get this error for basically all commands. Any ideas what this could be?

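    One thing worth checking (an educated guess, not a confirmed diagnosis): the traceback suggests the dialect's current schema is None, which happens when the connection URL names no database. A quick way to see the difference, with made-up credentials:

```python
from urllib.parse import urlsplit

# No database in the URL path -> MySQL has no schema selected
no_db = urlsplit('mysql://user:pw@localhost').path    # ''
# With a database name appended
with_db = urlsplit('mysql://user:pw@localhost/mydb').path  # '/mydb'
```
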
    opened by ddofborg 7
  • Add selecting json function into find method

    Related to #262. Hi pudo, I have implemented JSON selection that can be combined with other find parameters. Here are some usage examples:

    # Notice: the selected key's type depends on the given value's type;
    #         e.g. if an integer is given but the stored value is a float, it is cast to integer.
    # Supported operations: >(gt), <(lt), >=(gte), <=(lte), =(==,is), !=(<>, not), between("..")
    # id json_column
    # 0  {"key":-0.5}
    # 1  {"key":0.5}
    # 2  {"key":1.5}
    results = table.find(_json={'json_column':{'key':{'>=': 0.0, '<':1.0}}}) # id = [1]
    results = table.find(_json={'json_column':{'key':{'>=': 0, '<':1}}}) # int(-0.5)==0, id = [0,1]
    
    # id json_column
    # 0  [0,1,2]
    # 1  [0,0.5,1]
    # 2  [0]
    # find rows by index
    results = table.find(_json={'json_column':{1:{'>=': 0.0, '<':1.0}}}) # id = [1]
    
    # id json_column
    # 0  {"key1":{"key2":-1}}
    # 1  {"key1":{"key2":0.5}}
    # find rows by path
    results = table.find(_json={'json_column':{('key1','key2'):{'between':[0.0,1.0]}}}) # id = [1]
    

    However, the sqlite3 module in the current standard library does not support this function, so I am not sure how to add a unit test.

    opened by Remalloc 0
Owner
Friedrich Lindenberg
Data and software engineer, investigative support.
Generate database table diagram from SQL data definition.

sql2diagram Generate database table diagram from SQL data definition. e.g. "CREATE TABLE ..." See Example below How does it works? Analyze the SQL to

django-cas-ng 1 Feb 8, 2022
Import entity definition document into SQLite3. Manage the entity. Also, create a "Create Table SQL file".

EntityDocumentMaker Version 1.00 After importing the entity definition (Excel file), store the data in sqlite3.

G-jon FujiYama 1 Jan 9, 2022
dask-sql is a distributed SQL query engine in python using Dask

dask-sql is a distributed SQL query engine in Python. It allows you to query and transform your data using a mixture of common SQL operations and Python code and also scale up the calculation easily if you need it.

Nils Braun 222 May 17, 2022
Databank is an easy-to-use Python library for making raw SQL queries in a multi-threaded environment.

Databank Databank is an easy-to-use Python library for making raw SQL queries in a multi-threaded environment. No ORM, no frills. Thread-safe. Only ra

snapADDY GmbH 4 Apr 4, 2022
A pandas-like deferred expression system, with first-class SQL support

Ibis: Python data analysis framework for Hadoop and SQL engines Service Status Documentation Conda packages PyPI Azure Coverage Ibis is a toolbox to b

Ibis Project 1.8k May 27, 2022
An extension package of 🤗 Datasets that provides support for executing arbitrary SQL queries on HF datasets

datasets_sql A 🤗 Datasets extension package that provides support for executing arbitrary SQL queries on HF datasets. It uses DuckDB as a SQL engine

Mario Šaško 6 Mar 24, 2022
Making it easy to query APIs via SQL

Shillelagh Shillelagh (ʃɪˈleɪlɪ) is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library): from shillelagh.backends.apsw

Beto Dealmeida 127 May 15, 2022
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

dbd: database prototyping tool dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL d

Zdenek Svoboda 36 May 9, 2022
edaSQL is a library to link SQL to Exploratory Data Analysis and further more in the Data Engineering.

edaSQL is a python library to bridge the SQL with Exploratory Data Analysis where you can connect to the Database and insert the queries. The query results can be passed to the EDA tool which can give greater insights to the user.

Tamil Selvan 7 May 19, 2022
Creating a python package to convert /transfer excelsheet data to a mysql Database Table

Creating a python package to convert /transfer excelsheet data to a mysql Database Table

Odiwuor Lameck 1 Jan 7, 2022
Anomaly detection on SQL data warehouses and databases

With CueObserve, you can run anomaly detection on data in your SQL data warehouses and databases. Getting Started Install via Docker docker run -p 300

Cuebook 152 May 19, 2022
Estoult - a Python toolkit for data mapping with an integrated query builder for SQL databases

Estoult Estoult is a Python toolkit for data mapping with an integrated query builder for SQL databases. It currently supports MySQL, PostgreSQL, and

halcyon[nouveau] 13 Feb 25, 2022
Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc.

Simple DDL Parser Build with ply (lex & yacc in python). A lot of samples in 'tests/. Is it Stable? Yes, library already has about 5000+ usage per day

Iuliia Volkova 55 May 20, 2022
A simple python package that perform SQL Server Source Control and Auto Deployment.

deploydb Deploy your database objects automatically when the git branch is updated. Production-ready! ⚙️ Easy-to-use Customizable Installation I

Mert Güvençli 9 Apr 3, 2022
SQL for Humans™

Records: SQL for Humans™ Records is a very simple, but powerful, library for making raw SQL queries to most relational databases. Just write SQL. No b

Kenneth Reitz 6.8k May 19, 2022
PyRemoteSQL is a python SQL client that allows you to connect to your remote server with phpMyAdmin installed.

PyRemoteSQL Python MySQL remote client Basically this is a python SQL client that allows you to connect to your remote server with phpMyAdmin installe

ProbablyX 2 Oct 16, 2021
SQL queries to collections

SQC SQL Queries to Collections Examples from sqc import sqc data = [ {"a": 1, "b": 1}, {"a": 2, "b": 1}, {"a": 3, "b": 2}, ] Simple filte

Alexander Volkovsky 1 Oct 21, 2021
Python script to clone SQL dashboard from one workspace to another

Databricks dashboard clone Unofficial project to allow Databricks SQL dashboard copy from one workspace to another. Resource clone Setup: Create a fil

Quentin Ambard 7 Apr 8, 2022