Splitgraph command line client and Python library

Overview

Splitgraph is a tool for building, versioning and querying reproducible datasets. It's inspired by Docker and Git, so it feels familiar. And it's powered by PostgreSQL, so it works seamlessly with existing tools in the Postgres ecosystem. Use Splitgraph to package your data into self-contained data images that you can share with other Splitgraph instances.

Splitgraph.com, or Splitgraph Cloud, is a public Splitgraph instance where you can share and discover data. It's a Splitgraph peer powered by the Splitgraph Core code in this repository, adding proprietary features like a data catalog, multitenancy, and a distributed SQL proxy.

You can explore 40k+ open datasets in the catalog. You can also connect directly to the Data Delivery Network and query any of the datasets, without installing anything.

To install sgr (the command line client) or a local Splitgraph Engine, see the Installation section of this readme.

Build and Query Versioned, Reproducible Datasets

Splitfiles give you a declarative language, inspired by Dockerfiles, for expressing data transformations in ordinary SQL familiar to any researcher or business analyst. You can reference other images, or even other databases, with a simple JOIN.
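
For a sense of the syntax, here is a hedged sketch of a Splitfile, executed through the Python API's execute_commands entry point (the same code path sgr build takes); the source repository, column names and output repository are illustrative:

from splitgraph.core.repository import Repository
from splitgraph.splitfile.execution import execute_commands

SPLITFILE = """
FROM demo/weather IMPORT rdu AS source_data

SQL {
    CREATE TABLE monthly_summary AS
        SELECT date_trunc('month', date) AS month,
               avg(temperaturemax) AS avg_max
        FROM source_data GROUP BY 1
}
"""

# Build the image into an output repository, like sgr build would.
execute_commands(SPLITFILE, {}, output=Repository("myuser", "weather-summary"))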

When you build data with Splitfiles, you get provenance tracking of the resulting data: it's possible to find out what sources went into every dataset and know when to rebuild it if the sources ever change. You can easily integrate Splitgraph into your existing CI pipelines, to keep your data up-to-date and stay on top of changes to upstream sources.

Splitgraph images are also version-controlled, and you can manipulate them with Git-like operations through a CLI. You can check out any image into a PostgreSQL schema and interact with it using any PostgreSQL client. Splitgraph will capture your changes to the data, and then you can commit them as delta-compressed changesets that you can package into new images.
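
As a minimal sketch of that workflow through the Python library (assuming a local engine that already has demo/weather cloned; the UPDATE statement is illustrative):

from splitgraph.core.repository import Repository

repo = Repository("demo", "weather")
repo.images["latest"].checkout()  # materialize the image into a Postgres schema
# Change the data with ordinary SQL; Splitgraph records the delta.
repo.run_sql("UPDATE rdu SET precipitation = 0 WHERE precipitation IS NULL")
image = repo.commit()             # package the changes as a new, delta-compressed image
print(image.image_hash)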

Splitgraph supports PostgreSQL foreign data wrappers. We call this feature mounting. With mounting, you can query other databases (like PostgreSQL/MongoDB/MySQL) or open data providers (like Socrata) from your Splitgraph instance with plain SQL. You can even snapshot the results or use them in Splitfiles.
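
The same operation is available from Python via the mount() function that sgr mount calls internally; here is a hedged sketch for a PostgreSQL source, assuming the postgres_fdw handler accepts these connection keys (all details are placeholders):

from splitgraph.hooks.mount_handlers import mount

mount(
    "staging",                     # local schema to mount the foreign tables into
    mount_handler="postgres_fdw",  # or e.g. mongo_fdw, mysql_fdw
    handler_kwargs={
        "host": "db.example.com",
        "port": 5432,
        "username": "analyst",
        "password": "secret",
        "dbname": "production",
        "remote_schema": "public",
    },
)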

Why Splitgraph?

Splitgraph isn't opinionated and doesn't break existing abstractions. To any existing PostgreSQL application, Splitgraph images are just another database. We have carefully designed Splitgraph to not break the abstraction of a PostgreSQL table and wire protocol, because doing otherwise would mean throwing away a vast existing ecosystem of applications, users, libraries and extensions. This means that a lot of tools that work with PostgreSQL work with Splitgraph out of the box.

Components

The code in this repository, known as Splitgraph Core, contains:

  • sgr command line client: sgr is the main command line tool used to work with Splitgraph "images" (data snapshots). Use it to ingest data, work with splitfiles, and push data to Splitgraph.com.
  • Splitgraph Engine: a Docker image of the latest Postgres with Splitgraph and other required extensions pre-installed.
  • Splitgraph Python library: All Splitgraph functionality is available in the Python API, offering first-class support for data science workflows including Jupyter notebooks and Pandas dataframes.
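
For example, a hedged sketch of the Pandas round-trip, assuming the sql_to_df/df_to_table helpers in splitgraph.ingestion.pandas and a demo/weather image on the local engine (column names are illustrative):

from splitgraph.core.repository import Repository
from splitgraph.ingestion.pandas import df_to_table, sql_to_df

repo = Repository("demo", "weather")
df = sql_to_df("SELECT * FROM rdu", repository=repo)  # query an image into a DataFrame

df["temperaturemax_c"] = (df["temperaturemax"] - 32) * 5 / 9
df_to_table(df, repository=repo, table="rdu_celsius")  # write it back as a new table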

Docs

Documentation is available at https://www.splitgraph.com/docs.

We also recommend reading our Blog.

Installation

Prerequisites:

  • Docker is required to run the Splitgraph Engine. sgr must have access to Docker. You either need to install Docker locally or have access to a remote Docker socket.

For Linux and OSX, once Docker is running, install Splitgraph with a single script:

$ bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"

This will download the sgr binary and set up the Splitgraph Engine Docker container.

Alternatively, you can get the sgr single binary from the releases page and run sgr engine add to create an engine.

See the installation guide for more installation methods.

Quick start guide

You can follow the quick start guide, which walks you through the basics of using Splitgraph with public and private data.

Alternatively, Splitgraph comes with plenty of examples to get you started.

If you're stuck or have any questions, check out the documentation or join our Discord channel!

Contributing

Setting up a development environment

  • Splitgraph requires Python 3.6 or later.
  • Install Poetry to manage dependencies: curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
  • Install pre-commit hooks (we use Black to format code)
  • git clone --recurse-submodules https://github.com/splitgraph/splitgraph.git
  • poetry install
  • To build the engine Docker image: cd engine && make

Running tests

The test suite requires docker-compose. You will also need to add these lines to your /etc/hosts or equivalent:

127.0.0.1       local_engine
127.0.0.1       remote_engine
127.0.0.1       objectstorage

To run the core test suite, do

docker-compose -f test/architecture/docker-compose.core.yml up -d
poetry run pytest -m "not mounting and not example"

To run the test suite related to "mounting" and importing data from other databases (PostgreSQL, MySQL, Mongo), do

docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.mounting.yml up -d
poetry run pytest -m mounting

Finally, to test the example projects, do

# Example projects spin up their own engines
docker-compose -f test/architecture/docker-compose.core.yml -f test/architecture/docker-compose.mounting.yml down -v
poetry run pytest -m example

All of these tests run in CI.

Comments
  • Cannot mount Snowflake as a data source behind a proxy

    We are evaluating Splitgraph in our corporate environment, where we have to go through a proxy to reach Snowflake.

    When we do

    sgr --version
    sgr, version 0.2.10
    
    sgr -v DEBUG mount snowflake test_snowflake -o@- <<EOF
    heredoc> {
    heredoc>     "username": "XXX",
    heredoc>     "password": "XXX",
    heredoc>     "account": "XXX",
    heredoc>     "database": "XXX",
    heredoc>     "schema": "XXX"
    heredoc> }
    heredoc> EOF
    

    we get the following error

    error: Traceback (most recent call last):
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/commandline/__init__.py", line 116, in invoke
    error:     result = super(click.Group, self).invoke(ctx)
    error:   File "/home/XXX/python3.7/site-packages/click/core.py", line 1259, in invoke
    error:     return _process_result(sub_ctx.command.invoke(sub_ctx))
    error:   File "/home/XXX/python3.7/site-packages/click/core.py", line 1259, in invoke
    error:     return _process_result(sub_ctx.command.invoke(sub_ctx))
    error:   File "/home/XXX/python3.7/site-packages/click/core.py", line 1066, in invoke
    error:     return ctx.invoke(self.callback, **ctx.params)
    error:   File "/home/XXX/python3.7/site-packages/click/core.py", line 610, in invoke
    error:     return callback(*args, **kwargs)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/commandline/mount.py", line 69, in _callback
    error:     mount(schema, mount_handler=handler_name, handler_kwargs=handler_options)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/hooks/mount_handlers.py", line 69, in mount
    error:     source.mount(schema=mountpoint, overwrite=overwrite, tables=tables)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/hooks/data_source/fdw.py", line 134, in mount
    error:     self._create_foreign_tables(schema, server_id, tables)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/hooks/data_source/fdw.py", line 144, in _create_foreign_tables
    error:     _import_foreign_schema(self.engine, schema, remote_schema, server_id, tables)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/hooks/data_source/fdw.py", line 299, in _import_foreign_schema
    error:     engine.run_sql(query)
    error:   File "/home/XXX/python3.7/site-packages/splitgraph/engine/postgres/engine.py", line 501, in run_sql
    error:     cur.execute(statement, _convert_vals(arguments) if arguments else None)
    error: psycopg2.errors.InternalError_: Error in python: OperationalError
    error: DETAIL:  (snowflake.connector.errors.OperationalError) 250003: Failed to execute request: HTTPSConnectionPool(host='XXX.snowflakecomputing.com', port=443): Max retries exceeded with url: /session/v1/login-request?request_id=XXX&databaseName=XXX&schemaName=XXX&request_guid=XXX(Caused by ConnectTimeoutError(<snowflake.connector.vendored.urllib3.connection.HTTPSConnection object at 0x7f31c8e0df98>, 'Connection to XXX.snowflakecomputing.com timed out. (connect timeout=60)'))
    error: (Background on this error at: http://sqlalche.me/e/14/e3q8)
    

    It seems that although the Snowflake connector Splitgraph uses supports proxy settings via HTTPS_PROXY, HTTP_PROXY and NO_PROXY [1], psycopg2 doesn't pass them over when creating new servers [2].

    [1] https://github.com/splitgraph/snowflake-sqlalchemy#using-a-proxy-server [2] https://github.com/splitgraph/splitgraph/blob/3cc20ef9021c153344cb0e52247dcc9162812d50/splitgraph/hooks/data_source/fdw.py#L259

    opened by mapshen 19
  • strange error with splitfile

    I completed this demo https://www.splitgraph.com/docs/getting-started/decentralized-demo and am trying out different splitfiles.

    My first attempt is something simple. In an empty splitfile, I added this line and it works: FROM demo/weather IMPORT rdu AS source_data

    But when I replace that line with something I thought was equivalent, it doesn't work

    # does not work
    FROM demo/weather IMPORT {SELECT * FROM rdu} AS source_data
    

    I'm getting this error

    >sgr -v DEBUG build rdu-weather-summary.splitfile 
    Executing Splitfile rdu-weather-summary.splitfile with arguments {}
    
    Step 1/1 : FROM demo/weather IMPORT {SELECT * FROM rdu} AS source_data
    Resolving repository demo/weather
    Gathering remote metadata...
    Fetched metadata for 2 images, 1 table, 0 objects and 1 tag.
    Importing 1 table from demo/weather:b2019b4321c1 into rdu-weather-summary
    debug: Mounting demo/weather_tmp_clone:b2019b4321c116c277ba966435ff1c2ed8f5c037ae0650b6abf3efec7a39984c/rdu into o5e43720645ede717012cd3a3b67bfa4f
    error: Traceback (most recent call last):
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 47, in _checkout_or_calculate_layer
    error:     output.images.by_hash(image_hash).checkout()
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/core/image_manager.py", line 119, in by_hash
    error:     raise ImageNotFoundError("No images starting with %s found!" % image_hash)
    error: splitgraph.exceptions.ImageNotFoundError: No images starting with 91a5102a961e4a3c64f0cbdec5099d054aaa07b74ee81829fd0517f688a9be9c found!
    error: 
    error: During handling of the above exception, another exception occurred:
    error: 
    error: Traceback (most recent call last):
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/commandline/__init__.py", line 114, in invoke
    error:     result = super(click.Group, self).invoke(ctx)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    error:     return _process_result(sub_ctx.command.invoke(sub_ctx))
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    error:     return ctx.invoke(self.callback, **ctx.params)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    error:     return callback(*args, **kwargs)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/commandline/splitfile.py", line 57, in build_c
    error:     execute_commands(splitfile.read(), args, output=output_repository)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 208, in execute_commands
    error:     provenance_line = _execute_import(node, output)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 336, in _execute_import
    error:     return _execute_repo_import(
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 434, in _execute_repo_import
    error:     _checkout_or_calculate_layer(target_repository, target_hash, _calc)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 51, in _checkout_or_calculate_layer
    error:     calc_func()
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/splitfile/execution.py", line 424, in _calc
    error:     target_repository.import_tables(
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/core/common.py", line 139, in wrapped
    error:     return func(self, *args, **kwargs)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/core/repository.py", line 782, in import_tables
    error:     return self._import_tables(
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/core/repository.py", line 831, in _import_tables
    error:     self._import_new_table(
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/core/repository.py", line 898, in _import_new_table
    error:     self.object_engine.run_sql_in(
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/engine/__init__.py", line 157, in run_sql_in
    error:     result = self.run_sql(sql, arguments, return_shape=return_shape)
    error:   File "/home/projector-user/.pyenv/versions/3.9.5/lib/python3.9/site-packages/splitgraph/engine/postgres/engine.py", line 516, in run_sql
    error:     cur.execute(statement, arguments)
    error: psycopg2.errors.InternalError_: Error in python: KeyError
    error: DETAIL:  'splitgraph'
    error: 
    
    opened by mingfang 11
  • Configuration file requires write access

    I wish to use Splitgraph on a JupyterHub instance. I'd like to provide all users with the same configuration. So I created a file at /etc/xdg/splitgraph/sgconfig.ini and set the SG_CONFIG_FILE environment variable to point to this.

    However, Splitgraph requires write access to this file. That's a bit odd for a system-wide configuration file! Would it be possible for Splitgraph to write to the user file rather than the system one? Or is this kind of system-wide configuration just not supported/recommended?

    /opt/tljh/user/lib/python3.6/site-packages/splitgraph/cloud/__init__.py in access_token(self)
        519 
        520         set_in_subsection(config, "remotes", self.remote, "SG_CLOUD_ACCESS_TOKEN", new_access_token)
    --> 521         overwrite_config(config, get_singleton(config, "SG_CONFIG_FILE"))
        522         return new_access_token
        523 
    
    /opt/tljh/user/lib/python3.6/site-packages/splitgraph/config/export.py in overwrite_config(new_config, config_path, include_defaults)
        105         new_config, config_format=True, no_shielding=True, include_defaults=include_defaults
        106     )
    --> 107     with open(config_path, "w") as f:
        108         f.write(new_config_data)
    
    PermissionError: [Errno 13] Permission denied: '/etc/xdg/splitgraph/sgconfig.ini'
    
    opened by harrybiddle 5
  • Splitfile build error - error: NotImplementedError: Printer for node 'TableLikeClause' is not implemented yet

    I'm running the splitfile below after sgr csv import <source file>.

    FROM ${REPOSITORY}
    
    SQL {
        CREATE TABLE IF NOT EXISTS ${TABLE}
        (LIKE "${SOURCE}".${TABLE}
        INCLUDING ALL)
    }
    

    But it's printing the error below.

    error: NotImplementedError: Printer for node 'TableLikeClause' is not implemented yet
    

    I'm using this CREATE TABLE ... (LIKE ...) syntax to automatically copy the source table's primary key, which I'm hoping was created by

    sgr csv import --primary-key my_key
    
    opened by mingfang 5
  • Five minute demo doesn't work

    I followed all the steps of the "Five minute demo" but it doesn't work. When I execute the following command I get an error.

    sgr sql -s splitgraph/2016_election "SELECT \
      candidate_normalized, SUM(votes) FROM precinct_results \
      WHERE county_fips=11001 GROUP BY candidate_normalized"
    

    Error:

    (splitgraph) levi_leal@hake:~/workspace/POC/splitgraph$ sgr sql -s splitgraph/2016_election "SELECT \
    >   candidate_normalized, SUM(votes) FROM precinct_results \
    >   WHERE county_fips=11001 GROUP BY candidate_normalized"
    error: psycopg2.errors.InternalError_: Error in python: ObjectCacheError
    error: DETAIL:  Not all objects required for splitgraph/2016_election:3835145ada3f07cad99087d1b1071122d58c48783cbfe4694c101d35651fba90:precinct_results have been fetched. Missing 2 objects (oaf3cca3210e1903f06872d0041097dd284423958352d5cd7b5209eceec6cbf,o974b20261ee5f1ac124a8445a65c00e6377cfabb93b5db92bd60fc7ca3fcee)
    

    I noticed the problem only occurs when I use the layered option on the checkout.

    sgr checkout --layered splitgraph/2016_election:latest
    

    When I use the checkout without the layered flag it works properly

    (splitgraph) levi_leal@hake:~/workspace/POC/splitgraph$ sgr checkout splitgraph/2016_election:latest
    Need to download 20 objects (26.81 MiB), cache occupancy: 0.00 B/10.00 GiB
    Fetching 20 objects, total size 26.81 MiB
    Getting download URLs from registry PostgresEngine data.splitgraph.com ([email protected]:5432/sgregistry)...
    100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  6.64obj/s, object=o1ccf32547...]
    100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:23<00:00,  1.16s/obj, object=o974b20261...]
    Checked out splitgraph/2016_election:3835145ada3f.
    (splitgraph) levi_leal@hake:~/workspace/POC/splitgraph$ sgr sql -s splitgraph/2016_election "SELECT \
    >   candidate_normalized, SUM(votes) FROM precinct_results \
    >   WHERE county_fips=11001 GROUP BY candidate_normalized"
    clinton  282830
    in         6551
    johnson    4906
    stein      4258
    trump     12723
    
    opened by leviplj 5
  • Bump pre-commit from 1.18.2 to 1.18.3

    Bumps pre-commit from 1.18.2 to 1.18.3.

    dependencies 
    opened by dependabot-preview[bot] 5
  • Changed pg_dump/restore shelling out to using postgres' COPY TO/FROM

    Removes the binary dependency on pg_dump/restore. Much faster for lots of small objects, possibly slightly slower for large objects. Not sure at which point the overhead from gzipping beats the overhead from larger objects.
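
    For reference, a rough sketch of the COPY-based approach with plain psycopg2 (the connection string and table name are placeholders, not the actual Splitgraph internals):

    import gzip
    import psycopg2

    conn = psycopg2.connect("dbname=splitgraph user=sgr host=localhost port=5432")
    with conn.cursor() as cur, gzip.open("/tmp/object.csv.gz", "wt") as out:
        # COPY ... TO STDOUT streams rows straight to the client, so there is
        # no per-object pg_dump subprocess to start up.
        cur.copy_expert("COPY some_object TO STDOUT WITH CSV", out)
    conn.close()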

    Benchmarks

    Uploading 570 small-ish objects (<20 rows each) upstream (dumping + minio):

    BEFORE: bulk of time spent farming out to pg_dump (startup takes about 0.8s)

    Thu Nov 22 11:36:58 2018    push_s3_current.cprofile
             1627726 function calls (1627725 primitive calls) in 524.916 seconds
       Ordered by: cumulative time
       List reduced from 567 to 20 due to restriction <20>
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000  524.926  524.926 {built-in method builtins.exec}
            1    0.000    0.000  524.926  524.926 <string>:1(<module>)
            1    0.001    0.001  524.925  524.925 /home/mildbyte/splitgraph-opensource/splitgraph/commands/push_pull.py:176(push)
            1    0.002    0.002  522.233  522.233 /home/mildbyte/splitgraph-opensource/splitgraph/objects/loading.py:83(upload_objects)
            1    0.032    0.032  522.221  522.221 /home/mildbyte/splitgraph-opensource/splitgraph/objects/s3.py:20(s3_upload_objects)
          570    0.144    0.000  508.173    0.892 /home/mildbyte/splitgraph-opensource/splitgraph/objects/dumping.py:14(dump_object_to_file)
          570    0.028    0.000  507.636    0.891 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/subprocess.py:295(check_output)
          570    0.052    0.000  507.604    0.891 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/subprocess.py:372(run)
          570    0.026    0.000  500.243    0.878 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/subprocess.py:803(communicate)
         1144  500.060    0.437  500.060    0.437 {method 'read' of '_io.BufferedReader' objects}
          570    0.043    0.000   13.967    0.025 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:522(fput_object)
          570    0.031    0.000   13.871    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:723(put_object)
          570    0.026    0.000   13.777    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1420(_do_put_object)
          570    0.064    0.000   13.668    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1762(_url_open)
          572    0.037    0.000   12.778    0.022 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/poolmanager.py:301(urlopen)
          572    0.057    0.000   12.513    0.022 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:446(urlopen)
          572    0.103    0.000   11.804    0.021 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:319(_make_request)
          572    0.037    0.000   10.471    0.018 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:1287(getresponse)
          572    0.060    0.000   10.242    0.018 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:290(begin)
          572    0.038    0.000    9.507    0.017 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:257(_read_status)
    
    

    AFTER: I think it's possible to make this ever so slightly faster by piping directly to Minio (currently we still dump files to /tmp for a 9.5s cost in this trace) + possibly running multiple jobs in parallel.

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   28.853   28.853 {built-in method builtins.exec}
            1    0.001    0.001   28.853   28.853 <string>:1(<module>)
            1    0.001    0.001   28.849   28.849 /home/mildbyte/splitgraph-opensource/splitgraph/commands/push_pull.py:176(push)
            1    0.001    0.001   25.690   25.690 /home/mildbyte/splitgraph-opensource/splitgraph/objects/loading.py:83(upload_objects)
            1    0.017    0.017   25.687   25.687 /home/mildbyte/splitgraph-opensource/splitgraph/objects/s3.py:20(s3_upload_objects)
          570    0.029    0.000   16.058    0.028 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:522(fput_object)
          570    0.023    0.000   15.989    0.028 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:723(put_object)
          570    0.018    0.000   15.915    0.028 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1420(_do_put_object)
          570    0.029    0.000   15.854    0.028 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1762(_url_open)
          572    0.020    0.000   15.272    0.027 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/poolmanager.py:301(urlopen)
          572    0.042    0.000   15.129    0.026 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:446(urlopen)
          572    0.049    0.000   14.666    0.026 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:319(_make_request)
          572    0.013    0.000   14.029    0.025 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:1287(getresponse)
          572    0.040    0.000   13.952    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:290(begin)
          572    0.030    0.000   13.381    0.023 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:257(_read_status)
         6298    0.060    0.000   13.371    0.002 {method 'readline' of '_io.BufferedReader' objects}
          572    0.009    0.000   13.311    0.023 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/socket.py:572(readinto)
          572   13.297    0.023   13.297    0.023 {method 'recv_into' of '_socket.socket' objects}
         2319   10.242    0.004   10.522    0.005 {method 'execute' of 'psycopg2.extensions.cursor' objects}
          570    0.097    0.000    9.574    0.017 /home/mildbyte/splitgraph-opensource/splitgraph/objects/dumping.py:14(dump_object_to_file)
    

    Clone + download all objects: 40-45s

    AFTER WITH GZIP: basically the same

    Thu Nov 22 14:07:04 2018    push_s3_copy_to_gzip.cprofile
             1205720 function calls (1205719 primitive calls) in 28.582 seconds
       Ordered by: cumulative time
       List reduced from 579 to 20 due to restriction <20>
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   28.600   28.600 {built-in method builtins.exec}
            1    0.001    0.001   28.600   28.600 <string>:1(<module>)
            1    0.001    0.001   28.599   28.599 /home/mildbyte/splitgraph-opensource/splitgraph/commands/push_pull.py:176(push)
            1    0.001    0.001   25.588   25.588 /home/mildbyte/splitgraph-opensource/splitgraph/objects/loading.py:83(upload_objects)
            1    0.018    0.018   25.583   25.583 /home/mildbyte/splitgraph-opensource/splitgraph/objects/s3.py:20(s3_upload_objects)
          570    0.027    0.000   16.407    0.029 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:522(fput_object)
          570    0.019    0.000   16.343    0.029 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:723(put_object)
          570    0.018    0.000   16.288    0.029 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1420(_do_put_object)
          570    0.026    0.000   16.223    0.028 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/minio-4.0.6-py3.6.egg/minio/api.py:1762(_url_open)
          572    0.018    0.000   15.687    0.027 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/poolmanager.py:301(urlopen)
          572    0.037    0.000   15.552    0.027 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:446(urlopen)
          572    0.048    0.000   15.055    0.026 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/site-packages/urllib3/connectionpool.py:319(_make_request)
          572    0.018    0.000   14.370    0.025 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:1287(getresponse)
          572    0.034    0.000   14.264    0.025 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:290(begin)
          572    0.033    0.000   13.693    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/http/client.py:257(_read_status)
         6298    0.063    0.000   13.686    0.002 {method 'readline' of '_io.BufferedReader' objects}
          572    0.011    0.000   13.623    0.024 /home/mildbyte/miniconda3/envs/splitgraph-prototype/lib/python3.6/socket.py:572(readinto)
          572   13.607    0.024   13.607    0.024 {method 'recv_into' of '_socket.socket' objects}
         2319    9.451    0.004    9.703    0.004 {method 'execute' of 'psycopg2.extensions.cursor' objects}
          570    0.053    0.000    9.118    0.016 /home/mildbyte/splitgraph-opensource/splitgraph/objects/dumping.py:16(dump_object_to_file)
    

    Large object benchmarks

    1M SNAP, 10x 1000-row DIFFs:

    Sizes: SNAP 82MiB, DIFF 116KiB (pg_dump/restore was 39MiB/45KiB), 41MiB/45KiB with gzipping

    Push: 15s (pg_dump/restore was 30s), 45s with gzipping (?????)

    Clone + download all: 20s (pg_dump/restore was 15s), 20s with gzipping

    100K SNAP, 10x 1000-row DIFFs, with gzipping

    Sizes: 4.1MiB/45KiB

    Push: 6s (was 8s, probably comparable)

    Clone + download all: 3s (was 4s, probably comparable)

    opened by mildbyte 5
  • Bump pyinstaller from 4.2 to 4.3

    Bumps pyinstaller from 4.2 to 4.3.

    Release notes

    Sourced from pyinstaller's releases.

    v4.3

    Please see the changelog if you wish to see a full list of changes.

    Changelog

    Sourced from pyinstaller's changelog.

    4.3 (2021-04-16)

    Features

    • Provide basic implementation for FrozenImporter.get_source() that allows reading source from .py files that are collected by hooks as data files. (#5697)
    • Raise the maximum allowed size of CArchive (and consequently onefile executables) from 2 GiB to 4 GiB. (#3939)
    • The unbuffered stdio mode (the u option) now sets the Py_UnbufferedStdioFlag flag to enable unbuffered stdio mode in Python library. (#1441)
    • Windows: Set EXE checksums. Reduces false-positive detection from antiviral software. (#5579)
    • Add new command-line options that map to collect functions from hookutils: --collect-submodules, --collect-data, --collect-binaries, --collect-all, and --copy-metadata. (#5391)
    • Add new hook utility ~PyInstaller.utils.hooks.collect_entry_point for collecting plugins defined through setuptools entry points. (#5734)

    Bugfix

    • (macOS) Fix Bad CPU type in executable error in helper-spawned python processes when running under arm64-only flavor of Python on Apple M1. (#5640)
    • (OSX) Suppress missing library error messages for system libraries as those are never collected by PyInstaller and starting with Big Sur, they are hidden by the OS. (#5107)
    • (Windows) Change default cache directory to LOCALAPPDATA (from the original APPDATA). This is to make sure that cached data doesn't get synced with the roaming profile. For this and future versions AppData\Roaming\pyinstaller might be safely deleted. (#5537)
    • (Windows) Fix onefile builds not having manifest embedded when icon is disabled via --icon NONE. (#5625)
    • (Windows) Fix the frozen program crashing immediately with Failed to execute script pyiboot01_bootstrap message when built in noconsole mode and with import logging enabled (either via --debug imports or --debug all command-line switch). (#4213)
    • CArchiveReader now performs full back-to-front file search for MAGIC, allowing pyi-archive_viewer to open binaries with extra appended data after embedded package (e.g., digital signature). (#2372)
    • Fix MERGE() to properly set references to nested resources with their full shared-package-relative path instead of just basename. (#5606)
    • Fix onefile builds failing to extract files when the full target path exceeds 260 characters. (#5617)
    • Fix a crash in pyi-archive_viewer when quitting the application or moving up a level. (#5554)
    • Fix extraction of nested files in onefile builds created in MSYS environments. (#5569)
    • Fix installation issues stemming from unicode characters in file paths. (#5678)
    • Fix the build-time error under python 3.7 and earlier when ctypes is manually added to hiddenimports. (#3825)
    • Fix the return code if the frozen script fails due to unhandled exception. The return code 1 is used instead of -1, to keep the behavior consistent with that of the python interpreter. (#5480)
    • Linux: Fix binary dependency scanner to support changes to ldconfig introduced in glibc 2.33. (#5540)
    • Prevent MERGE (multipackage) from creating self-references for duplicated TOC entries. (#5652)
    • PyInstaller-frozen onefile programs are now compatible with staticx even if the bootloader is built as position-independent executable (PIE). (#5330)
    • Remove dependence on a private function removed in matplotlib 3.4.0rc1. (#5568)
    • Strip absolute paths from .pyc modules collected into base_library.zip to enable reproducible builds that are invariant to Python install location. (#5563)
    • (OSX) Fix issues with pycryptodomex on macOS. (#5583)
    • Allow compiled modules to be collected into base_library.zip. (#5730)
    • Fix a build error triggered by scanning ctypes.CDLL('libc.so') on certain Linux C compiler combinations. (#5734)
    • Improve performance and reduce stack usage of module scanning. (#5698)

    Hooks

    • Add support for Conda Forge's distribution of NumPy. (#5168)
    • Add support for package content listing via pkg_resources. The implementation enables querying/listing resources in a frozen package (both PYZ-embedded and on-filesystem, in that order of precedence) via pkg_resources.resource_exists(), resource_isdir(), and resource_listdir(). (#5284)
    • Hooks: Import correct typelib for GtkosxApplication. (#5475)
    • Prevent matplotlib hook from collecting current working directory when it fails to determine the path to matplotlib's data directory. (#5629)
    • Update pandas hook for compatibility with version 1.2.0 and later. (#5630)
    • Update hook for distutils.sysconfig to be compatible with pyenv-virtualenv. (#5218)
    • Update hook for sqlalchemy to support version 1.4.0 and above. (#5679)
    • Update hook for sysconfig to be compatible with pyenv-virtualenv. (#5018)

    ... (truncated)

    Commits
    • e20e74c release 4.3
    • 4a08839 rebuild man pages
    • 8090d56 Update readme version (for PyPI page)
    • 4529c65 Update credits
    • 9bc8dd7 Prepare for release 4.3
    • aeff9cf add readthedocs config
    • 419f349 Add hook utility for collecting setuptools entrypoints.
    • 5c2dda9 building: collect built-in extensions into lib-dynload sub-directory (#5604)
    • 577e755 Analysis: ctypes: Guard against errors triggered by find_library(). (#5734)
    • 745f03c Bootloader: waf: Support cross-compile for FreeBSD. [skip-ci]
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot-preview[bot] 4
  • Cannot specify warehouse or role when mounting Snowflake

    As per [1], the correct connection string should be

    'snowflake://<user_login_name>:@<account_name>/<database_name>/<schema_name>?warehouse=<warehouse_name>&role=<role_name>'

    but what we are constructing looks like

    snowflake://username:[email protected]/SOME_DB/TPCH_SF100warehouse=my_warehouse&role=role
    

    where ? is missing [3].
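
    A hypothetical sketch of the fix, adding the missing separator when assembling the URL (all names are placeholders):

    def snowflake_url(user, password, account, database, schema, warehouse, role):
        # Note the '?' before the first query parameter.
        return (
            f"snowflake://{user}:{password}@{account}/{database}/{schema}"
            f"?warehouse={warehouse}&role={role}"
        )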

    [1] https://github.com/splitgraph/snowflake-sqlalchemy#connection-parameters [2] https://github.com/splitgraph/splitgraph/pull/404/files#diff-242b1fb93665dd39dccadee94a3a5fb7310815248452b56a640269eaa49dcfd8R22-R24 [3] https://github.com/splitgraph/splitgraph/pull/404/files#diff-1ded16c49a8e8d9fbb6e60c55e94a75ad3630e4eb5fe4d2861a6dbd7af310941R126

    opened by mapshen 4
  • SG_ENGINE_PWD doesn't work

    sgr init connects to the database multiple times. The first time it connects, it uses the password supersecure even if the env var is set. Here is the output from adding print(kwargs) and traceback.print_stack() to the connect function in /usr/local/lib/python3.7/dist-packages/psycopg2/__init__.py:

    root@5b3a99e879eb:/# SG_ENGINE_PWD=asdf sgr init
    Initializing engine PostgresEngine LOCAL (sgr@localhost:5432/splitgraph)...
    {'dbname': 'postgres', 'user': 'sgr', 'password': 'supersecure', 'host': 'localhost', 'port': '5432', 'application_name': 'sgr 0.2.3'}
      File "/splitgraph/bin/sgr", line 5, in <module>
        cli()
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/splitgraph/splitgraph/commandline/__init__.py", line 115, in invoke
        result = super(click.Group, self).invoke(ctx)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/splitgraph/splitgraph/commandline/misc.py", line 192, in init_c
        init_engine(skip_object_handling=skip_object_handling)
      File "/splitgraph/splitgraph/core/engine.py", line 52, in init_engine
        engine.initialize(skip_object_handling=skip_object_handling)
      File "/splitgraph/splitgraph/engine/postgres/engine.py", line 694, in initialize
        with self._admin_conn() as admin_conn:
      File "/splitgraph/splitgraph/engine/postgres/engine.py", line 654, in _admin_conn
        application_name="sgr " + __version__,
      File "/usr/local/lib/python3.7/dist-packages/psycopg2/__init__.py", line 120, in connect
        traceback.print_stack()
    Database splitgraph already exists, skipping
    Ensuring the metadata schema at splitgraph_meta exists...
    {'host': 'localhost', 'port': 5432, 'user': 'sgr', 'password': 'asdf', 'dbname': 'splitgraph', 'application_name': 'sgr 0.2.3'}
      File "/splitgraph/bin/sgr", line 5, in <module>
        cli()
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
    
    opened by sixcorners 4
  • Internal exception executing SQL query

    I connected to host.splitgraph.com:5432 via JDBC and tried to run the following SQL query:

    SELECT 
        count(1) AS "count_of_rows"
    FROM "seattle-gov/seattle-real-time-fire-911-calls-kzjm-xkqj"."seattle_real_time_fire_911_calls" AS "seattle_real_time_fire_911_calls"
    ORDER BY 
        1 DESC 
    

    After 30 seconds, the following exception is thrown:

    (0) ERROR: Internal database error Where: PL/pgSQL function schema_controller.fatal(text) line 3 at RAISE SQL statement "SELECT schema_controller.fatal(error_text)" PL/pgSQL function schema_controller.get_result_from_cache(anyelement,text,text,bigint) line 14 at PERFORM

    I don't think it's a transient issue, as I tried a couple of times over ~2 hours.
    opened by buremba 4
  • Bump pglast from 3.4 to 4.1

    Bumps pglast from 3.4 to 4.1.

    Changelog

    Sourced from pglast's changelog.

    4.1 (2022-12-19)

    • Fix serialization glitches introduced by "Avoid overly abundancy of parentheses in expressions" (to be precise, by this commit: https://github.com/lelit/pglast/commit/6cfe75eea80f9c9bec4ba467e7ec1ec0796020de)

    4.0 (2022-12-12)

    4.0.dev0 (2022-11-24)

    • Update libpg_query to 14-3.0.0 (https://github.com/pganalyze/libpg_query/blob/14-latest/CHANGELOG.md#14-300---2022-11-17)
    • Avoid overly abundancy of parentheses in expressions
    • Prefer SELECT a FROM b LIMIT ALL to ... LIMIT NONE

    Breaking changes

    • Target PostgreSQL 14
    • The wrapper classes used in previous versions, implemented in pglast.node, are gone: now everything works on top of the AST classes (issue #80: https://github.com/lelit/pglast/issues/80)
    • The Ancestor class is not iterable anymore: it was an internal implementation facility, now moved to a _iter_members() method

    Version 3

    3.17 (2022-11-04)

    • Fix AlterSubscriptionStmt printer, handling "SET PUBLICATION" without options

    ... (truncated)

    Commits

    • 2efd1a3 Release 4.1
    • 25b9a9f Update libpg_query commit ref in printers doc
    • 70bd1b7 Add missing space after closing paren
    • 21fbb35 Merge dependabot hint about upload-artifact in CI workflow
    • 4af2203 Merge dependabot hint about setup-qemu-action in CI workflow
    • 0ae99c3 Bump actions/upload-artifact from 2 to 3
    • a3c41da Bump docker/setup-qemu-action from 1 to 2
    • 102aaef Release 4.0
    • db17e34 Update libpg_query to 14-3.0.0
    • a86ebf7 Move narrative from CHANGES.rst to development.rst
    • Additional commits viewable in the compare view (https://github.com/lelit/pglast/compare/v3.4...v4.1)

    dependencies 
    opened by dependabot[bot] 0
  • Bump sqlalchemy from 1.4.22 to 1.4.45

    Bumps sqlalchemy from 1.4.22 to 1.4.45.

    Release notes

    Sourced from sqlalchemy's releases.

    1.4.45

    Released: December 10, 2022

    orm

    • [orm] [bug] Fixed bug where _orm.Session.merge() would fail to preserve the current loaded contents of relationship attributes that were indicated with the _orm.relationship.viewonly parameter, thus defeating strategies that use _orm.Session.merge() to pull fully loaded objects from caches and other similar techniques. In a related change, fixed issue where an object that contains a loaded relationship that was nonetheless configured as lazy='raise' on the mapping would fail when passed to _orm.Session.merge(); checks for "raise" are now suspended within the merge process assuming the _orm.Session.merge.load parameter remains at its default of True.

      Overall, this is a behavioral adjustment to a change introduced in the 1.4 series as of #4994, which took "merge" out of the set of cascades applied by default to "viewonly" relationships. As "viewonly" relationships aren't persisted under any circumstances, allowing their contents to transfer during "merge" does not impact the persistence behavior of the target object. This allows _orm.Session.merge() to correctly suit one of its use cases, that of adding objects to a Session that were loaded elsewhere, often for the purposes of restoring from a cache.

      References: #8862

    • [orm] [bug] Fixed issues in _orm.with_expression() where expressions that were composed of columns that were referenced from the enclosing SELECT would not render correct SQL in some contexts, in the case where the expression had a label name that matched the attribute which used _orm.query_expression(), even when _orm.query_expression() had no default expression. For the moment, if the _orm.query_expression() does have a default expression, that label name is still used for that default, and an additional label with the same name will continue to be ignored. Overall, this case is pretty thorny so further adjustments might be warranted.

      References: #8881

    engine

    • [engine] [bug] Fixed issue where _engine.Result.freeze() method would not work for textual SQL using either _sql.text() or _engine.Connection.exec_driver_sql().

      References: #8963

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Bump certifi from 2022.6.15 to 2022.12.7

    Bumps certifi from 2022.6.15 to 2022.12.7.

    dependencies 
    opened by dependabot[bot] 0
  • Bump types-chardet from 0.1.5 to 5.0.4.1

    Bumps types-chardet from 0.1.5 to 5.0.4.1.

    dependencies 
    opened by dependabot[bot] 0
  • Bump sphinx-rtd-theme from 1.0.0 to 1.1.1

    Bumps sphinx-rtd-theme from 1.0.0 to 1.1.1.

    Changelog

    Sourced from sphinx-rtd-theme's changelog.

    1.1.1

    Fixes

    • Fix wrapping bug on cross references (#1368)


    1.1.0

    Dependency Changes

    Many documentation projects depend on sphinx-rtd-theme without specifying a version of the theme (unpinned) while also depending on unpinned versions of Sphinx. The latest version of sphinx-rtd-theme ideally always supports the latest version of Sphinx, but this is not guaranteed.

    This release adds upper bounds to the direct dependencies Sphinx and docutils, which safeguards against mixing the theme with possibly incompatible future versions of Sphinx and docutils (a short illustration follows the quoted changelog below).

    • Sphinx versions supported: 1.6 to 5.2.x
    • Sphinx<6 (#1332)
    • docutils<0.18 (unchanged, but will be bumped in an upcoming release)

    Features

    • Nicer styles for <kbd> (#967)
    • New styling for breadcrumbs (#1073)

    Fixes

    • Suffixes in Sphinx version caused build errors (#1345)
    • Table cells with multiple paragraphs get wrong formatting (#289)
    • Definition lists rendered wrongly in api docs (#1052)
    • Citation not styled properly (#1078)
    • Long URLs did not wrap (#1193)

    Minor Changes

    • Sphinx 5.2 added to test matrix (#1348)
    • Python 3.10 added to test matrix (#1334)
    • Supplemental Docker setup for development (#1319)
    • Most of setup.py migrated to setup.cfg (#1116)
    • Jinja2 context variable sphinx_version_info is now (major, minor, -1); the patch component is always -1. Reason: it's complicated. (#1345)

    ... (truncated)
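    To make the effect of these upper bounds concrete, here is a small illustration using the packaging library, whose specifier logic is what pip's resolver applies. The bounds below are the ones quoted above; the snippet itself is an illustration, not code from sphinx-rtd-theme:

        from packaging.specifiers import SpecifierSet
        from packaging.version import Version

        # Upper bounds added in sphinx-rtd-theme 1.1.0, as pip-style specifiers.
        bounds = {"Sphinx": SpecifierSet("<6"), "docutils": SpecifierSet("<0.18")}
        candidates = {"Sphinx": ["5.2.3", "6.0.0"], "docutils": ["0.17.1", "0.18"]}

        for dep, spec in bounds.items():
            for version in candidates[dep]:
                ok = Version(version) in spec
                print(f"{dep} {version}: {'allowed' if ok else 'blocked'} by {spec}")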

    dependencies 
    opened by dependabot[bot] 0
  • Support reading a subset of headers from the CSV file

    For example:

        sgr csv import -s year -s commodity_code […]

    Or even being able to pass the type as well (select the column and give it a type):

        sgr csv import -s year int4 -s commodity_code varchar […]

    (A Python sketch of this idea follows the entry below.)
    opened by mildbyte 0
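
    A minimal Python sketch of the proposal above, assuming a hypothetical read_selected_columns helper that is not part of sgr; it reads only the named CSV columns and optionally coerces their types:

        import csv
        import io

        # Hypothetical illustration of the proposed flag: read only selected
        # columns from a CSV stream, optionally coercing each to a Python type.
        def read_selected_columns(stream, selections):
            """selections maps column name -> converter, or None to keep str."""
            reader = csv.DictReader(stream)
            missing = set(selections) - set(reader.fieldnames or [])
            if missing:
                raise ValueError(f"columns not in CSV header: {sorted(missing)}")
            for row in reader:
                yield {
                    name: (convert(row[name]) if convert else row[name])
                    for name, convert in selections.items()
                }

        # Roughly what `sgr csv import -s year int4 -s commodity_code varchar`
        # might mean, with int4 mapped to int and varchar kept as str.
        data = io.StringIO("year,commodity_code,value\n2021,0101,12.5\n")
        for row in read_selected_columns(data, {"year": int, "commodity_code": None}):
            print(row)  # {'year': 2021, 'commodity_code': '0101'}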
Releases: v0.3.12