Making it easy to query APIs via SQL

Overview

Shillelagh

Shillelagh (ʃɪˈleɪlɪ) is an implementation of the Python DB API 2.0 based on SQLite (using the APSW library):

from shillelagh.backends.apsw.db import connect

connection = connect(":memory:")
cursor = connection.cursor()

query = "SELECT * FROM a_table"
for row in cursor.execute(query):
    print(row)

There is also a SQLAlchemy dialect:

from sqlalchemy.engine import create_engine

engine = create_engine("shillelagh://")
connection = engine.connect()

query = "SELECT * FROM a_table"
for row in connection.execute(query):
    print(row)

And a command-line utility:

$ shillelagh
sql> SELECT * FROM a_table

Installation

Install Shillelagh with pip:

$ pip install 'shillelagh'

This will install an unofficial APSW package from the Python package index. It's highly recommended to install a newer version:

$ pip install https://github.com/rogerbinns/apsw/releases/download/3.36.0-r1/apsw-3.36.0-r1.zip \
--global-option=fetch --global-option=--version --global-option=3.36.0 --global-option=--all \
--global-option=build --global-option=--enable-all-extensions

How is it different?

Shillelagh allows you to easily query non-SQL resources. For example, if you have a Google Spreadsheet you can query it directly as if it were a table in a database:

SELECT country, SUM(cnt)
FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
WHERE cnt > 0
GROUP BY country

You can even run INSERT/DELETE/UPDATE queries against the spreadsheet:

UPDATE "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
SET cnt = cnt + 1
WHERE country != 'BR'
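
From Python, the same sheet can be queried through the DB API connection shown earlier. A minimal sketch, using the public demo sheet from the examples above:

from shillelagh.backends.apsw.db import connect

connection = connect(":memory:")
cursor = connection.cursor()

# The sheet URL is used directly as the table name.
sheet = "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
query = f'SELECT country, SUM(cnt) FROM "{sheet}" WHERE cnt > 0 GROUP BY country'

for row in cursor.execute(query):
    print(row)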

Queries like these are supported by adapters. Shillelagh ships with adapters for resources such as Google Sheets, CSV files, WeatherAPI, S3 files, HTML tables, and Pandas dataframes; see the documentation for the full list.

A query can combine data from multiple adapters:

INSERT INTO "/tmp/file.csv"
SELECT time, chance_of_rain
FROM "https://api.weatherapi.com/v1/history.json?q=London"
WHERE time IN (
  SELECT datetime
  FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=1648320094"
)

The query above reads timestamps from a Google sheet, uses them to filter weather data from WeatherAPI, and writes the chance of rain into a (pre-existing) CSV file.
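
Adapters that talk to authenticated APIs are configured through adapter_kwargs at connect time. A minimal sketch for the query above, assuming the WeatherAPI adapter is registered as "weatherapi" and accepts an api_key argument (both names are assumptions, based on the issues quoted below):

from shillelagh.backends.apsw.db import connect

# "weatherapi" and "api_key" are assumed names; check the adapter's docs.
connection = connect(
    ":memory:",
    adapter_kwargs={"weatherapi": {"api_key": "YOUR_WEATHERAPI_KEY"}},
)
cursor = connection.cursor()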

New adapters are relatively easy to implement. There's a step-by-step tutorial that explains how to create a new adapter for an API or file type.
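
To give a sense of what that involves, here is a rough, hypothetical skeleton (a sketch only; it follows the Adapter base class, but check the tutorial for the exact method signatures):

from typing import Any, Dict, Iterator, List, Optional, Tuple

from shillelagh.adapters.base import Adapter
from shillelagh.fields import Float, String
from shillelagh.filters import Filter
from shillelagh.typing import RequestedOrder, Row


class ExampleAPI(Adapter):
    """A made-up adapter that serves two hardcoded rows."""

    safe = True

    # Columns are declared as Field class attributes; the base class
    # introspects them to build the virtual table schema.
    country = String()
    cnt = Float()

    @staticmethod
    def supports(uri: str, fast: bool = False, **kwargs: Any) -> Optional[bool]:
        # Claim any URI with our made-up scheme.
        return uri.startswith("example://")

    @staticmethod
    def parse_uri(uri: str) -> Tuple[str]:
        # The returned tuple is passed to __init__.
        return (uri,)

    def __init__(self, uri: str):
        super().__init__()
        self.uri = uri

    def get_data(
        self,
        bounds: Dict[str, Filter],
        order: List[Tuple[str, RequestedOrder]],
        **kwargs: Any,
    ) -> Iterator[Row]:
        # Every row needs a "rowid"; anything not handled here via
        # bounds/order is filtered and sorted by SQLite itself.
        yield {"rowid": 0, "country": "BR", "cnt": 10.0}
        yield {"rowid": 1, "country": "US", "cnt": 20.0}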

Comments
  • gsheets error: Unknown pattern "d/m/yyyy"

    The date is parsed properly in Google Sheets (it even shows a calendar dropdown), but this error appears when importing the sheet in Superset. @betodealmeida

    bug help wanted good first issue 
    opened by harishanumula 15
  • chore(deps-dev): bump requests-cache from 0.7.1 to 0.9.4

    Bumps requests-cache from 0.7.1 to 0.9.4.

    Release notes

    Sourced from requests-cache's releases.

    Improved performance, thread safety, Cache-Control support, and general QoL

    See Changelog for release details

    Conditional requests, custom request matching, and comprehensive user guide

    See Changelog for release details

    Changelog

    Sourced from requests-cache's changelog.

    0.9.4 (2022-04-22)

    • Fix forwarding connection parameters passed to RedisCache for redis-py 4.2 and python <=3.8
    • Fix forwarding connection parameters passed to MongoCache for pymongo 4.1 and python <=3.8

    0.9.3 (2022-02-22)

    • Fix handling BSON serializer differences between pymongo's bson and standalone bson codec.
    • Handle CorruptGridFile error in GridFS backend
    • Fix cache path expansion for user directories (~/...) for SQLite and filesystem backends
    • Fix request normalization for request body with a list as a JSON root
    • Skip normalizing a JSON request body if it's excessively large (>10MB) due to performance impact
    • Fix some thread safety issues:
      • Fix race condition in SQLite backend with dropping and recreating tables in multiple threads
      • Fix race condition in filesystem backend when one thread deletes a file after it's opened but before it is read by a different thread
      • Fix multiple race conditions in GridFS backend

    0.9.2 (2022-02-15)

    • Fix serialization in filesystem backend with binary content that is also valid UTF-8
    • Fix some regression bugs introduced in 0.9.0:
      • Add support for params as a positional argument to CachedSession.request()
      • Add support for disabling expiration for a single request with CachedSession.request(..., expire_after=-1)

    0.9.1 (2022-01-15)

    • Add support for python 3.10.2 and 3.9.10 (regarding resolving ForwardRef types during deserialization)
    • Add support for key-only request parameters (regarding hashing request data for cache key creation)
    • Reduce verbosity of log messages when encountering an invalid JSON request body

    0.9.0 (2022-01-01)

    See all issues and PRs for 0.9

    Expiration & Headers:

    • Use Cache-Control request headers by default
    • Add support for Cache-Control: immutable
    • Add support for immediate expiration + revalidation with Cache-Control: max-age=0 and Expires: 0
    • Reset expiration for cached response when a 304 Not Modified response is received

    Backends:

    • Filesystem and SQLite backends: Add better error message if parent path exists but isn't a directory
    • Redis: Add optional integration with Redis TTL to improve performance for removing expired responses
      • This is enabled by default, but may be disabled

    Other features:

    • Support expire_after param for CachedSession.send()

    Performance:

    • Fix duplicate read operation for checking whether to read from redirects cache
    • Skip unnecessary contains check if a key is in the main responses cache
    • Make per-request expiration thread-safe for both CachedSession.request() and CachedSession.send()
    • Some micro-optimizations for request matching

    ... (truncated)

    Commits
    • 39fb312 Merge pull request #614 from JWCook/parameter-forwarding
    • 259f803 Fix forwarding connection parameters for redis-py 4.2 and pymongo 4.1 on pyth...
    • e85e3c0 Forgot to bump version in init file
    • 1e84136 Change wording a bit on cache header behavior
    • cc5305f Merge pull request #537 from JWCook/concurrency-tests
    • 3c77e2c Update dev dependencies and changelog
    • 3453ae6 Handle CorruptGridFile error in GridFS backend
    • fe82501 Increase stress test multiplier to 10 for pre-deployment tests
    • 0bd1446 Add lock around remove_expired_responses() for SQLite, Filesystem, and GridFS...
    • 9576fcf Improve filesystem backend thread safety
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 7
  • Core google analytics adapter and dialect

    Summary

    Adds an adapter and dialect for the Google Analytics Universal property, so we can visualize data from GA in Superset.

    Note: the Universal property of GA will be deprecated by Google sometime next year, but other BI solutions like Tableau and Power BI use the Core Reporting API v4, so I chose this API for the purpose of connecting Superset to GA. The latest API, GA4, is still in beta, so I thought this was the best approach; it can be ported to the GA4 property later.

    TODO:

    • fill all items in ALL_DIMESIONS and ALL_METRICS in the adapter
    • finally, add good unit tests :)

    Testing instructions

    Have some data in your Google Analytics account. Generate a service account file for your account from the Google console, and enable the Analytics API for your account.

    Run the script below:

    from shillelagh.backends.apsw.db import connect
    
    connection = connect(
        "./google_analytics.db",
        adapter_kwargs={
            "googleanalytics": {
                "service_account_info": { DICTIONARY IN YOUR SERVICE ACCOUNT FILE },
                "view_id": "267966216",
            }
        },
    )
    cursor = connection.cursor()
    
    query = " SELECT  `userType`, `newUsers`, `users` FROM 'google_analytics' where date between '2022-08-05' and '2022-08-06' "
    
    for row in cursor.execute(query):
        print(row)
    
    opened by mayurnewase 6
  • Cannot create Google Sheets dataset in Superset

    Error adding a database from Superset, with the logs below. I'm trying to access the test sheet here: https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8

    2021-08-23 18:20:31,398:INFO:shillelagh.adapters.api.gsheets.adapter:GET https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/gviz/tq?gid=0&tq=SELECT%20%2A%20LIMIT%201
    2021-08-23 18:20:31,400:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): docs.google.com:443
    2021-08-23 18:20:31,604:DEBUG:urllib3.connectionpool:https://docs.google.com:443 "GET /spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/gviz/tq?gid=0&tq=SELECT%20%2A%20LIMIT%201 HTTP/1.1" 200 None
    2021-08-23 18:20:31,607:DEBUG:shillelagh.adapters.api.gsheets.adapter:Received payload: {'version': '0.6', 'reqId': '0', 'status': 'ok', 'sig': '1453301915', 'table': {'cols': [{'id': 'A', 'label': 'country', 'type': 'string'}, {'id': 'B', 'label': 'cnt', 'type': 'number', 'pattern': 'General'}], 'rows': [{'c': [{'v': 'BR'}, {'v': 1.0, 'f': '1'}]}], 'parsedNumHeaders': 0}}
    2021-08-23 18:20:31,619:WARNING:shillelagh.backends.apsw.db:Couldn't load adapter
    2021-08-23 18:20:31,621:WARNING:shillelagh.backends.apsw.db:Couldn't load adapter
    ()
    Traceback (most recent call last):
      File "/app/superset/datasets/commands/create.py", line 51, in run
        dataset.fetch_metadata(commit=False)
      File "/app/superset/connectors/sqla/models.py", line 1502, in fetch_metadata
        new_columns = self.external_metadata()
      File "/app/superset/connectors/sqla/models.py", line 666, in external_metadata
        database=self.database, table_name=self.table_name, schema_name=self.schema,
      File "/app/superset/connectors/sqla/utils.py", line 47, in get_physical_table_metadata
        raise NoSuchTableError
    sqlalchemy.exc.NoSuchTableError: ()
    2021-08-23 18:20:31,622:WARNING:superset.datasets.commands.create:()
    Traceback (most recent call last):
      File "/app/superset/datasets/commands/create.py", line 51, in run
        dataset.fetch_metadata(commit=False)
      File "/app/superset/connectors/sqla/models.py", line 1502, in fetch_metadata
        new_columns = self.external_metadata()
      File "/app/superset/connectors/sqla/models.py", line 666, in external_metadata
        database=self.database, table_name=self.table_name, schema_name=self.schema,
      File "/app/superset/connectors/sqla/utils.py", line 47, in get_physical_table_metadata
        raise NoSuchTableError
    sqlalchemy.exc.NoSuchTableError: ()
    Error creating model DatasetRestApi: Dataset could not be created.
    Traceback (most recent call last):
      File "/app/superset/datasets/commands/create.py", line 51, in run
        dataset.fetch_metadata(commit=False)
      File "/app/superset/connectors/sqla/models.py", line 1502, in fetch_metadata
        new_columns = self.external_metadata()
      File "/app/superset/connectors/sqla/models.py", line 666, in external_metadata
        database=self.database, table_name=self.table_name, schema_name=self.schema,
      File "/app/superset/connectors/sqla/utils.py", line 47, in get_physical_table_metadata
        raise NoSuchTableError
    sqlalchemy.exc.NoSuchTableError: ()
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/app/superset/datasets/api.py", line 251, in post
        new_model = CreateDatasetCommand(g.user, item).run()
      File "/app/superset/datasets/commands/create.py", line 65, in run
        raise DatasetCreateFailedError() from ex
    superset.datasets.commands.exceptions.DatasetCreateFailedError: Dataset could not be created.
    2021-08-23 18:20:31,625:ERROR:superset.datasets.api:Error creating model DatasetRestApi: Dataset could not be created.
    Traceback (most recent call last):
      File "/app/superset/datasets/commands/create.py", line 51, in run
        dataset.fetch_metadata(commit=False)
      File "/app/superset/connectors/sqla/models.py", line 1502, in fetch_metadata
        new_columns = self.external_metadata()
      File "/app/superset/connectors/sqla/models.py", line 666, in external_metadata
        database=self.database, table_name=self.table_name, schema_name=self.schema,
      File "/app/superset/connectors/sqla/utils.py", line 47, in get_physical_table_metadata
        raise NoSuchTableError
    sqlalchemy.exc.NoSuchTableError: ()
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/app/superset/datasets/api.py", line 251, in post
        new_model = CreateDatasetCommand(g.user, item).run()
      File "/app/superset/datasets/commands/create.py", line 65, in run
        raise DatasetCreateFailedError() from ex
    superset.datasets.commands.exceptions.DatasetCreateFailedError: Dataset could not be created.
    

    I'm building Superset from a Dockerfile that looks like this:

    FROM apache/superset:latest
    
    # Switching to root to install the required packages
    USER root
    
    RUN pip install clickhouse-driver==0.2.0
    RUN pip install clickhouse-sqlalchemy==0.1.6
    RUN pip install shillelagh[gsheetsapi]==1.0.0
    
    # Switching back to using the `superset` user
    USER superset
    
    bug 
    opened by RhysDavis 6
  • Unable to load adapter gsheetsapi

    Thank you for this very useful library. I have a simple Streamlit Python app in which shillelagh fails because the gsheetsapi adapter is not getting loaded. There is probably some similarity to a past issue (https://github.com/betodealmeida/shillelagh/issues/112), so I set the version to 1.0.1, but without success.

    requirements.txt

    streamlit
    shillelagh[gsheetaspi]==1.0.1
    pandas
    
    

    Error message:

      File "/app/gsheetdb_tutorial/math_db_ui.py", line 52, in establish_connection
        "gsheetsapi": {"service_account_file": service_account_file}})
    
      File "/home/appuser/venv/lib/python3.7/site-packages/shillelagh/backends/apsw/db.py", line 517, in connect
        adapter_kwargs = {mapping[k]: v for k, v in adapter_kwargs.items()}
    
      File "/home/appuser/venv/lib/python3.7/site-packages/shillelagh/backends/apsw/db.py", line 517, in <dictcomp>
        adapter_kwargs = {mapping[k]: v for k, v in adapter_kwargs.items()}
    
    KeyError: 'gsheetsapi'
    

    Can you please help?

    opened by code-anjali 4
  • Add support for using Google Application Default Credentials for Google Sheets

    This allows for simple authentication when running in an environment where app default credentials are enabled (e.g. Google Cloud Functions, Cloud Run, or on a local development machine where gcloud auth application-default login has been run).

    opened by bobcallaway 4
  • Tests failing due to missing API key for weather API

    Looks like something is wrong with the SHILLELAGH_ADAPTER_KWARGS env var. Either the env var is missing entirely, or it is missing the api_key for the weather API.

    opened by cancan101 3
  • Potential Issue with exact on Fields

    The docs indicate that exact=True means no post-filtering is needed: https://github.com/betodealmeida/shillelagh/blob/97197bd564e96a23c5587be5c9e315f7c0e693ea/src/shillelagh/fields.py#L121-L125. However, that value seems to be passed to apsw: https://github.com/betodealmeida/shillelagh/blob/14579e4b8c3159adc4076b36638d13f00dc70609/src/shillelagh/backends/apsw/vt.py#L305, and the docs for that indicate that False means SQLite won't do any double checking: https://github.com/betodealmeida/shillelagh/blob/14579e4b8c3159adc4076b36638d13f00dc70609/src/shillelagh/typing.py#L16-L17 (also consistent with the apsw docs page).

    Am I misunderstanding something here?

    bug 
    opened by cancan101 3
  • Unable to connect to Shillelagh using SQLAlchemy URI in Superset

    I want to use Shillelagh to query a local CSV file. Steps followed:

    1. pip install "shillelagh"
    2. Even installed APSW using $ pip install https://github.com/rogerbinns/apsw/releases/download/3.38.1-r1/apsw-3.38.1-r1.zip
      --global-option=fetch --global-option=--version --global-option=3.38.1 --global-option=--all
      --global-option=build --global-option=--enable-all-extensions
    3. I have already deployed Superset; when trying to connect a new database, entering the SQLAlchemy URI "shillelagh://" and clicking Connect gives the error: An error occurred while creating databases: (sqlalchemy_uri) APSWDialect cannot be used as a data source for security reasons.

    I have already tried setting PREVENT_UNSAFE_DB_CONNECTIONS = False in config.py, but it still gives the same error.

    On the terminal, gsheets works. But when trying to query the CSV file with sql> SELECT * FROM "tips.csv"; I get the errors below:

    Unhandled exception in event loop:
      File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
        self._context.run(self._callback, *self._args)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/input/vt100.py", line 173, in callback_wrapper
        callback()
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/application/application.py", line 710, in read_from_input
        self.key_processor.process_keys()
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/key_binding/key_processor.py", line 270, in process_keys
        self._process_coroutine.send(key_press)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/key_binding/key_processor.py", line 185, in _process
        self._call_handler(matches[-1], key_sequence=buffer[:])
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/key_binding/key_processor.py", line 320, in _call_handler
        handler.call(event)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/key_binding/key_bindings.py", line 124, in call
        result = self.handler(event)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/shortcuts/prompt.py", line 807, in _accept_input
        self.default_buffer.validate_and_handle()
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/buffer.py", line 1877, in validate_and_handle
        self.append_to_history()
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/buffer.py", line 1385, in append_to_history
        self.history.append_string(self.text)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/history.py", line 74, in append_string
        self.store_string(string)
      File "/home/piyush/.local/lib/python3.8/site-packages/prompt_toolkit/history.py", line 294, in store_string
        with open(self.filename, "ab") as f:

    Exception [Errno 2] No such file or directory: '/home/piyush/.config/shillelagh/shillelagh.history'
    Press ENTER to continue...

    Traceback (most recent call last):
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 220, in execute
        self._cursor.execute(operation, parameters)
    apsw.SQLError: SQLError: no such table: /home/piyush/Downloads/tips.csv

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/piyush/.local/bin/shillelagh", line 8, in <module>
        sys.exit(main())
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/console.py", line 210, in main
        cursor.execute(sql)
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 81, in wrapper
        return method(self, *args, **kwargs)
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 231, in execute
        self._create_table(uri)
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/backends/apsw/db.py", line 271, in _create_table
        self._cursor.execute(
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/backends/apsw/vt.py", line 244, in Create
        adapter = self.adapter(*deserialized_args)
      File "/home/piyush/.local/lib/python3.8/site-packages/shillelagh/adapters/file/csvfile.py", line 106, in __init__
        column_names = next(reader)
    ValueError: could not convert string to float: 'total_bill'

    answered 
    opened by Piyushency 2
  • Didn't manage to apply query on public google spreadsheet

    Good day everyone! Currently I'm using the shillelagh package to access my Google spreadsheet, but I'm getting this error:
    Unsupported table: https://docs.google.com/spreadsheets/d/1Z8rDIh5Xu1QbrMU7afdT3kQB-75uHXwZ76AVJhTDH4w/edit#gid=1732117799. I'm leaving the link here (https://docs.google.com/spreadsheets/d/1Z8rDIh5Xu1QbrMU7afdT3kQB-75uHXwZ76AVJhTDH4w/edit#gid=1732117799) so you can see where my mistake is. Code I used:

    cursor = connection.cursor()
    mydf = pd.read_csv('C:\\Users\\R.Abdrakhmanov\\Desktop\\diploma_at_drive\\data\\ways.csv')
    SQL = """
        SELECT *
        FROM "https://docs.google.com/spreadsheets/d/1Z8rDIh5Xu1QbrMU7afdT3kQB-75uHXwZ76AVJhTDH4w/edit#gid=1732117799"
    """
    rows = cursor.execute(SQL)
    answered 
    opened by Renat2001 2
  • Queries that Return Zero Rows Raise Exception

    e.g. if you run:

    from sqlalchemy.engine import create_engine
    
    engine = create_engine("shillelagh://")
    connection = engine.connect()
    
    query = """SELECT country, SUM(cnt)
    FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=0"
    WHERE cnt > 10
    GROUP BY country"""
    for row in connection.execute(query):
        print(row)
    

    you get:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/result.py", line 964, in __iter__
        return self._iter_impl()
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/result.py", line 637, in _iter_impl
        return self._iterator_getter(self)
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 1180, in __get__
        obj.__dict__[self.__name__] = result = self.fget(obj)
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/result.py", line 361, in _iterator_getter
        make_row = self._row_getter
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 1180, in __get__
        obj.__dict__[self.__name__] = result = self.fget(obj)
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/result.py", line 320, in _row_getter
        keymap = metadata._keymap
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/cursor.py", line 1208, in _keymap
        self._we_dont_return_rows()
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/engine/cursor.py", line 1189, in _we_dont_return_rows
        util.raise_(
      File "/Users/alex/.pyenv/versions/my-app/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
        raise exception
    sqlalchemy.exc.ResourceClosedError: This result object does not return rows. It has been closed automatically
    
    bug help wanted good first issue 
    opened by cancan101 2
  • chore(deps-dev): bump pytest-integration from 0.2.2 to 0.2.3

    Bumps pytest-integration from 0.2.2 to 0.2.3.

    Changelog

    Sourced from pytest-integration's changelog.

    [0.2.3] - 2022-11-17

    Fixed

    • Fixed the pytest11 entry point; it was using a name copied from the example documentation rather than a proper one. This would break other plugins that made the same mistake.
    Commits

    dependencies 
    opened by dependabot[bot] 0
  • chore(deps-dev): bump requests-cache from 0.7.1 to 0.9.7

    Bumps requests-cache from 0.7.1 to 0.9.7.

    Release notes

    Sourced from requests-cache's releases.

    Improved performance, thread safety, Cache-Control support, and general QoL

    See Changelog for release details

    Conditional requests, custom request matching, and comprehensive user guide

    See Changelog for release details

    Changelog

    Sourced from requests-cache's changelog.

    0.9.7 (2022-10-26)

    Backport compatibility fixes from 1.0:

    • PyInstaller: Fix potential AttributeError due to undetected imports when requests-cache is bundled in a PyInstaller package
    • requests-oauthlib: Add support for header values as bytes for compatibility with OAuth1 features
    • cattrs: Add compatibility with cattrs 22.2
    • python: Add tests to ensure compatibility with python 3.11
    • Fix AttributeError when attempting to unpickle a CachedSession object, and instead disable pickling by raising a NotImplementedError

    Add the following for forwards-compatibility with 1.0:

    • DeprecationWarnings to give an earlier notice for methods deprecated (not removed) in 1.0
    • requests_cache.policy subpackage (will replace requests_cache.cache_control module)
    • BaseCache.contains()
    • BaseCache.delete()
    • BaseCache.filter()
    • CachedSession.settings

    0.9.6 (2022-08-24)

    Backport fixes from 1.0:

    • Remove potentially problematic row count from BaseCache.__str__()
    • Remove upper version constraints for all non-dev dependencies
    • Make dependency specification consistent between PyPI and Conda-Forge packages

    0.9.5 (2022-06-29)

    Backport fixes from 1.0:

    • Fix usage of memory backend with install_cache()
    • Add CachedRequest.path_url property
    • Add compatibility with cattrs 22.1

    0.9.4 (2022-04-22)

    Backport fixes from 1.0:

    • Fix forwarding connection parameters passed to RedisCache for redis-py 4.2 and python <=3.8
    • Fix forwarding connection parameters passed to MongoCache for pymongo 4.1 and python <=3.8

    0.9.3 (2022-02-22)

    • Fix handling BSON serializer differences between pymongo's bson and standalone bson codec.
    • Handle CorruptGridFile error in GridFS backend
    • Fix cache path expansion for user directories (~/...) for SQLite and filesystem backends
    • Fix request normalization for request body with a list as a JSON root
    • Skip normalizing a JSON request body if it's excessively large (>10MB) due to performance impact
    • Fix some thread safety issues:
      • Fix race condition in SQLite backend with dropping and recreating tables in multiple threads
      • Fix race condition in filesystem backend when one thread deletes a file after it's opened but before it is read by a different thread
      • Fix multiple race conditions in GridFS backend

    0.9.2 (2022-02-15)

    • Fix serialization in filesystem backend with binary content that is also valid UTF-8
    • Fix some regression bugs introduced in 0.9.0:
      • Add support for params as a positional argument to CachedSession.request()
      • Add support for disabling expiration for a single request with CachedSession.request(..., expire_after=-1)

    ... (truncated)

    Commits
    • fe4edbf Update changelog
    • ea25c28 Run CI jobs with python 3.11
    • 7ee48d9 Merge pull request #710 from requests-cache/forwards-compat
    • 096b10d Add wrapper properties for settings for partial forwards-compatibility with C...
    • 864afeb Add BaseCache.contains(), delete(), and filter() for forwards-compatibility w...
    • 1894434 Add DeprecationWarning if importing from cache_control module
    • 0035af2 Add policy subpackage for forwards-compatibility with 1.0
    • ae2a169 Add support for header values as bytes
    • 446bbc8 Explicitly disable pickling CachedSession objects
    • 170a627 Fix potential AttributeError due to undetected imports when requests-cache ...
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • chore(deps-dev): bump tabulate from 0.8.9 to 0.9.0

    Bumps tabulate from 0.8.9 to 0.9.0.

    Changelog

    Sourced from tabulate's changelog.

    • 0.9.0: Drop support for Python 2.7, 3.5, 3.6. Migrate to pyproject.toml project layout (PEP 621). New output formats: asciidoc, various *grid and *outline formats. New output features: vertical row alignment, separating lines. New input format: list of dataclasses (Python 3.7 or later). Support infinite iterables as row indices. Improve column width options. Improve support for ANSI escape sequences and document the behavior. Various bug fixes.
    • 0.8.10: Python 3.10 support. Bug fixes. Column width parameter.
    Commits
    • bf58e37 version bump to 0.9.0, update README (Benchmark, Contributors), CHANGELOG
    • fd0a34c Merge pull request #201 from astanin/dev-pep621
    • 0a6554e appveyor: upgrade setuptools before build (should fix UNKNOWN package name)
    • d99d9ae ignore ImportError when importing version number
    • 3e45eac update appveyor.yml to use pyproject.toml instead of setup.py
    • 05e88d2 fix test_cli - change script path, do not import .version if init.py is r...
    • 4e2eeb1 update tox.ini - use a virtual environment to build a source dist from the so...
    • 6e37802 Merge pull request #179 from KOLANYCH-libs:pyproject.toml
    • 9172378 fix tests failing after PR#183 (remove 1 space from the expected values)
    • 930a943 reformat with black, fix flake warnings
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Parse SQL to determine `colUsed`

    Is your feature request related to a problem? Please describe.

    Adapters could benefit from knowing which columns were requested, so they can fetch less data.

    Describe the solution you'd like

    SQLite has the information on colUsed, but it's not available in apsw (https://github.com/rogerbinns/apsw/issues/278). We could use sqloxide to determine the columns needed and pass the list to adapters, and if apsw eventually implements colUsed we can drop the SQL parsing.

    Describe alternatives you've considered

    This is the alternative. :)

    Additional context

    N/A

    enhancement developer performance blocked 
    opened by betodealmeida 1
  • asyncio-compatible dialect

    SQLAlchemy 1.4 now supports Python asyncio. In order to take advantage of this, the dialect must be "asyncio-compatible". It would be great to have a version of the base Dialect that can be used for this.

    A discussion for what that entails: https://github.com/sqlalchemy/sqlalchemy/discussions/7854.

    Currently the APSWDialect class subclasses SQLiteDialect, which is not async. There is a SQLiteDialect_aiosqlite that could potentially be used. The goal here is not necessarily to have async operations vis-à-vis SQLite, but rather to allow async operations when connecting to the APIs.

    This might be tricky as you would want this to be an async iterator: https://github.com/betodealmeida/shillelagh/blob/97197bd564e96a23c5587be5c9e315f7c0e693ea/src/shillelagh/backends/apsw/db.py#L221 and then likewise have the get_rows call be an async iterator: https://github.com/betodealmeida/shillelagh/blob/97197bd564e96a23c5587be5c9e315f7c0e693ea/src/shillelagh/backends/apsw/vt.py#L439-L441

    I filed this upstream though I am not 100% sure this is the right way to solve the ask: https://github.com/rogerbinns/apsw/issues/325

    enhancement 
    opened by cancan101 2
  • [Gsheets] parse pattern to determine type

    In this sheet (https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/edit#gid=1648320094) the column timeofday is returned as having type datetime (instead of timeofday), even though it's formatted as "Time".

    I hardcoded a solution in https://github.com/betodealmeida/shillelagh/pull/159 to fix the integration tests, but ideally we should parse the pattern in order to determine the type.

    enhancement help wanted 
    opened by betodealmeida 1
Releases (1.1.5)
  • 1.1.5(Dec 9, 2022)

    Two small fixes:

    • Handle dataframes without column names (affects both the Pandas and the HTML table adapters).
    • Support booleans on type inference from data.
  • 1.1.4(Dec 7, 2022)

  • 1.1.3(Nov 17, 2022)

    A small improvement to the generic JSON adapter so it can handle nested fields:

    sql> SELECT NS ->> '$[0]' AS first_ns FROM "https://api.domainsdb.info/v1/domains/search?domain=facebook&zone=com#$.domains[*]" LIMIT 1;
    first_ns
    ----------------------
    aron.ns.cloudflare.com
    sql>
    
  • 1.1.2(Nov 1, 2022)

  • 1.1.1(Oct 27, 2022)

  • 1.1.0(Jul 28, 2022)

    Heya!

    Today we're releasing Shillelagh 1.1.0. There are no breaking changes, but some interesting new additions:

    • Shillelagh now supports DROP TABLE. If you run DROP TABLE on a Google sheet URL, for example, the sheet will be deleted. This is only supported in the adapters where it makes sense. In the future we might also want to support CREATE TABLE.
    • The CLI now supports multi-line statements, which means that statements must be terminated with a semicolon. Before, pressing enter would immediately run the query.
    • The configuration file now uses appdirs for the directory location. If you're on a Mac you need to move your configuration files from ~/.config/shillelagh/ to ~/Library/Application\ Support/shillelagh/.
    • New adapters: there is now an adapter for querying CSV/JSON/Parquet files in S3, and an adapter to query HTML tables from any page.
    • Adapters can now request to handle LIMIT and OFFSET, instead of leaving the work to SQLite. This should greatly improve the performance of adapters, since it reduces the amount of data that needs to be fetched. Most of the builtin adapters support them in 1.1.0.
    • Adapters can now be registered at runtime, similar to how SQLAlchemy does with dialects (see the sketch after this list). Before, adapters had to be registered as entry points.
    • Additional documentation on the architecture, custom dialects and custom fields was added.
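
    A minimal sketch of runtime registration, assuming the registry module introduced in 1.1.0 (MyAdapter is a hypothetical adapter class, e.g. one built as in the tutorial):

    from shillelagh.adapters.registry import registry

    from myproject.adapters import MyAdapter  # hypothetical import

    # Register the adapter under a name at runtime, as an alternative
    # to the entry-point mechanism mentioned above.
    registry.add("myadapter", MyAdapter)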
  • 1.0.16(Jul 15, 2022)

    This release has no external changes. The only change is how parameters are passed from SQLite to the adapter. The new serialize function uses pickle to allow passing more than just strings to adapters.

  • 1.0.15(Jul 13, 2022)

    Includes a new adapter for S3 files, using S3 Select:

    SELECT * FROM "s3://bucket-name/sample_data.csv";
    

    Supports CSV, JSON, and Parquet.

    Also changes the way integers work: they are now manipulated as strings inside SQLite to prevent overflows.

  • 1.0.14(May 25, 2022)

  • 1.0.13(May 2, 2022)

  • 1.0.0(Aug 19, 2021)

    After almost 1 year of development and 4 years of planning, I finally released Shillelagh 1.0. The initial release focused on a stable and flexible API, with a solid adapter (Google Sheets). For the next milestone my plan is to focus on adding more adapters and improving performance.

Owner
Beto Dealmeida
Writing open source software since 2003.