Continuous Archiving for Postgres

Overview

WAL-E is a program designed to perform continuous archiving of PostgreSQL WAL files and base backups.

To correspond about using WAL-E or to collaborate on its development, do not hesitate to send mail to the mailing list at [email protected] (archives and subscription settings). GitHub issues are also currently being used to track known problems, so please feel free to submit those.

Installation

If no up-to-date packages are available to you via a package manager, this command can work on most operating systems:

sudo python3 -m pip install 'wal-e[aws,azure,google,swift]'

You can omit storage services you do not wish to use from the above list.

Primary Commands

WAL-E has these key commands:

  • backup-fetch
  • backup-push
  • wal-fetch
  • wal-push
  • delete

All of these operators work in the context of several environment variables that WAL-E reads. Which variables must be set depends on the storage provider being used, as detailed below.

WAL-E's organizing concept is the PREFIX. Prefixes must be set uniquely for each writing database, and prefix all objects stored for a given database. For example: s3://bucket/databasename.

Of these, the "push" operators send backup data to storage and "fetch" operators get backup data from storage.

The wal commands are called by Postgres's archive_command and restore_command to push or fetch write-ahead log segments, and the backup commands are used to push or fetch a hot backup of the base database that WAL segments can be applied to. Finally, the delete command is used to prune the archives so as to retain a finite number of backups.

AWS S3 and Work-alikes

  • WALE_S3_PREFIX (e.g. s3://bucket/path/optionallymorepath)
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_REGION (e.g. us-east-1)

Optional:

  • WALE_S3_ENDPOINT: See Manually specifying the S3 Endpoint
  • WALE_S3_STORAGE_CLASS: One of: STANDARD (default), REDUCED_REDUNDANCY, GLACIER, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, DEEP_ARCHIVE
  • AWS_SECURITY_TOKEN: When using AWS STS
  • Pass --aws-instance-profile to gather credentials from the Instance Profile. See Using AWS IAM Instance Profiles.

Azure Blob Store

The example below is based on the following blob storage in Azure in the resource group resgroup: https://store1.blob.core.windows.net/container1/nextpath

  • WALE_WABS_PREFIX (e.g. wabs://container1/nextpath)
  • WABS_ACCOUNT_NAME (e.g. store1)
  • WABS_ACCESS_KEY (Use key1 from running azure storage account keys list store1 --resource-group resgroup. You will need the Azure CLI installed for this to work.)
  • WABS_SAS_TOKEN (You only need this if you have not provided a WABS_ACCESS_KEY.)

Google Storage

  • WALE_GS_PREFIX (e.g. gs://bucket/path/optionallymorepath)
  • GOOGLE_APPLICATION_CREDENTIALS

Swift

  • WALE_SWIFT_PREFIX (e.g. swift://container/path/optionallymorepath)
  • SWIFT_AUTHURL
  • SWIFT_TENANT
  • SWIFT_USER
  • SWIFT_PASSWORD

Optional Variables:

  • SWIFT_AUTH_VERSION, which defaults to 2. Some object stores, such as SoftLayer, require version 1.
  • SWIFT_ENDPOINT_TYPE, which defaults to publicURL. This may be set to internalURL on object stores like Rackspace Cloud Files in order to use the internal network.

File System

  • WALE_FILE_PREFIX (e.g. file://localhost/backups/pg)

Important

Ensure that all writing servers have different _PREFIXes set. Reuse of a value between two writing databases will likely cause unrecoverable backups.

Dependencies

  • python (>= 3.4)
  • lzop
  • psql (>= 8.4)
  • pv

This software also has Python dependencies: installing with pip will attempt to resolve them:

  • gevent>=1.1.1
  • boto>=2.40.0
  • azure==3.0.0
  • google-cloud-storage>=1.4.0
  • python-swiftclient>=3.0.0
  • python-keystoneclient>=3.0.0

WAL-E can be used without installing the dependencies of back-end storage services one does not use: the imports for those are performed only if the storage configuration demands their use.

Examples

Pushing a base backup to S3:

$ AWS_SECRET_ACCESS_KEY=... wal-e                     \
  -k AWS_ACCESS_KEY_ID                                \
  --s3-prefix=s3://some-bucket/directory/or/whatever  \
  backup-push /var/lib/my/database

Sending a WAL segment to WABS:

$ WABS_ACCESS_KEY=... wal-e                                   \
  -a WABS_ACCOUNT_NAME                                        \
  --wabs-prefix=wabs://some-bucket/directory/or/whatever      \
  wal-push /var/lib/my/database/pg_xlog/WAL_SEGMENT_LONG_HEX

Push a base backup to Swift:

$ WALE_SWIFT_PREFIX="swift://my_container_name"              \
  SWIFT_AUTHURL="http://my_keystone_url/v2.0/"               \
  SWIFT_TENANT="my_tennant"                                  \
  SWIFT_USER="my_user"                                       \
  SWIFT_PASSWORD="my_password" wal-e                         \
  backup-push /var/lib/my/database

Push a base backup to Google Cloud Storage:

$ WALE_GS_PREFIX="gs://some-bucket/directory-or-whatever"     \
  GOOGLE_APPLICATION_CREDENTIALS=...                          \
  wal-e backup-push /var/lib/my/database

It is generally recommended that one use some sort of environment variable management with WAL-E: working with it this way is less verbose, less prone to error, and less likely to expose secret information in logs.

envdir, part of the daemontools package, is one recommended approach to setting environment variables. One can prepare an envdir-compatible directory like so:

# Assumption: the group is trusted to read secret information
# S3 Setup
$ umask u=rwx,g=rx,o=
$ mkdir -p /etc/wal-e.d/env
$ echo "secret-key-content" > /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
$ echo "access-key" > /etc/wal-e.d/env/AWS_ACCESS_KEY_ID
$ echo 's3://some-bucket/directory/or/whatever' > \
  /etc/wal-e.d/env/WALE_S3_PREFIX
$ chown -R root:postgres /etc/wal-e.d


# Assumption: the group is trusted to read secret information
# WABS Setup
$ umask u=rwx,g=rx,o=
$ mkdir -p /etc/wal-e.d/env
$ echo "secret-key-content" > /etc/wal-e.d/env/WABS_ACCESS_KEY
$ echo "access-key" > /etc/wal-e.d/env/WABS_ACCOUNT_NAME
$ echo 'wabs://some-container/directory/or/whatever' > \
  /etc/wal-e.d/env/WALE_WABS_PREFIX
$ chown -R root:postgres /etc/wal-e.d

After having done this preparation, it is possible to run WAL-E commands much more simply, with less risk of accidentally using incorrect values:

$ envdir /etc/wal-e.d/env wal-e backup-push ...
$ envdir /etc/wal-e.d/env wal-e wal-push ...
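To make the mechanism concrete, here is a rough shell emulation of what envdir does, using a throwaway directory (a sketch for illustration only; real deployments should use envdir itself):

```shell
# Rough emulation of `envdir DIR cmd`: each file in DIR becomes an
# environment variable named after the file, valued with its contents.
D=$(mktemp -d)
echo 's3://some-bucket/directory/or/whatever' > "$D/WALE_S3_PREFIX"
for f in "$D"/*; do
    export "$(basename "$f")=$(cat "$f")"
done
echo "$WALE_S3_PREFIX"
rm -r "$D"
```

Unlike this sketch, envdir also unsets variables for empty files and leaves the parent shell's environment untouched, which is part of why it is preferred.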

envdir is conveniently combined with the archive_command functionality used by PostgreSQL to enable continuous archiving. To enable continuous archiving, one needs to edit postgresql.conf and restart the server. The important settings to enable continuous archiving are related here:

wal_level = archive # hot_standby and logical in 9.x are also acceptable
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
archive_timeout = 60

Every segment archived will be noted in the PostgreSQL log.

Warning

PostgreSQL users can query the pg_settings view and see the archive_command employed. For that reason, do not put secret information into postgresql.conf; use envdir instead.

A base backup (via backup-push) can be uploaded at any time, but it must be done at least once in order to perform a restoration. It must be done again if you decide to skip archiving any WAL segments: replication cannot continue if there are any gaps in the stored WAL segments.

Primary Commands

backup-push, backup-fetch, wal-push, and wal-fetch represent the primary functionality of WAL-E and must reside on the database machine. Unlike the wal-push and wal-fetch commands, which function as described above, backup-push and backup-fetch require a little additional explanation.

backup-push

By default, backup-push will include all user-defined tablespaces in the database backup. Please see the backup-fetch section below for WAL-E's tablespace restoration behavior.

backup-fetch

Use backup-fetch to restore a base backup from storage.

This command makes use of the LATEST pseudo-backup-name to find a backup to download:

$ envdir /etc/wal-e.d/fetch-env wal-e               \
--s3-prefix=s3://some-bucket/directory/or/whatever  \
backup-fetch /var/lib/my/database LATEST

Also allowed is naming a backup specifically as seen in backup-list, which can be useful for restoring older backups for the purposes of point in time recovery:

$ envdir /etc/wal-e.d/fetch-env wal-e               \
--s3-prefix=s3://some-bucket/directory/or/whatever  \
backup-fetch                                        \
/var/lib/my/database base_LONGWALNUMBER_POSITION_NUMBER

One will need to provide a recovery.conf file to recover WAL segments associated with the backup. In short, recovery.conf needs to be created in Postgres's data directory with content like:

restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch %f %p'
standby_mode = on

A database with such a recovery.conf set will poll WAL-E storage for WAL indefinitely. You can exit recovery by running pg_ctl promote.

If you wish to perform Point In Time Recovery (PITR), you can add recovery targets to recovery.conf, looking like this:

recovery_target_time = '2017-02-01 19:58:55'

There are several other ways to specify the recovery target, e.g. by transaction id.

Regardless of the recovery target, by default Postgres will pause recovery at that point, allowing inspection before promotion. See recovery targets for details on how to customize what happens when the target criterion is reached.
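Putting the pieces together, a recovery.conf for PITR might look like the following (a sketch: the timestamp is a placeholder, and recovery_target_action requires PostgreSQL 9.5 or newer):

restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch %f %p'
recovery_target_time = '2017-02-01 19:58:55'
recovery_target_action = 'pause'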

Tablespace Support

If and only if you are using tablespaces, you will need to consider additional issues in how to run backup-fetch. The options are:

  • User-directed Restore

    WAL-E expects that tablespace symlinks will be in place prior to a backup-fetch run. This means preparing your target path by ensuring ${PG_CLUSTER_DIRECTORY}/pg_tblspc contains all required symlinks before restoration time. If any expected symlink does not exist, backup-fetch will fail.

  • Blind Restore

    If you are unable to reproduce tablespace storage structures prior to running backup-fetch, you can set the flag --blind-restore. This directs WAL-E to skip the symlink verification process and place all data directly in the ${PG_CLUSTER_DIRECTORY}/pg_tblspc path.

  • Restoration Specification

    You can provide a restoration specification file to WAL-E using the backup-fetch --restore-spec RESTORE_SPEC option. This spec must be valid JSON and must list all tablespaces in the backup, the target storage path each requires, and the symlink Postgres expects for the tablespace. Here is an example for a cluster with a single tablespace:

    {
        "12345": {
            "loc": "/data/postgres/tablespaces/tblspc001/",
            "link": "pg_tblspc/12345"
        },
        "tablespaces": [
            "12345"
        ]
    }
    

    Given this information WAL-E will create the data storage directory and symlink it appropriately in ${PG_CLUSTER_DIRECTORY}/pg_tblspc.
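    Since a malformed spec will make backup-fetch fail, it can be worth sanity-checking the JSON before use; a minimal sketch (the spec path is hypothetical):

```shell
# Write a restore spec and verify it parses as JSON before passing it
# to `wal-e backup-fetch --restore-spec` (spec path is hypothetical).
SPEC=$(mktemp)
cat > "$SPEC" <<'EOF'
{
    "12345": {
        "loc": "/data/postgres/tablespaces/tblspc001/",
        "link": "pg_tblspc/12345"
    },
    "tablespaces": ["12345"]
}
EOF
python3 -m json.tool < "$SPEC" > /dev/null && echo "spec OK"
```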

Warning

"link" properties of tablespaces in the restore specification must contain the pg_tblspc prefix, it will not be added for you.

Auxiliary Commands

These are commands that are not used expressly for backup or WAL pushing and fetching, but are important to the monitoring or maintenance of WAL-E archived databases. Unlike the critical four operators for taking and restoring backups (backup-push, backup-fetch, wal-push, wal-fetch) that must reside on the database machine, these commands can be productively run from any computer with the appropriate _PREFIX set and the necessary credentials to manipulate or read data there.

backup-list

backup-list is useful for listing base backups that are complete for a given WAL-E context. Some fields are only filled in when the --detail option is passed to backup-list [1].

Note

Some --detail-only fields are not strictly to the right of fields that do not require --detail to be passed. This is not a problem if one uses any CSV parsing library (two consecutive tab delimiters signify the empty column), but if one is hoping to use string mangling to extract fields, exhibit care.
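Because the output is tab-delimited, splitting on tabs preserves empty columns that naive whitespace splitting would silently drop; a sketch with a made-up sample line (the field values are illustrative only):

```shell
# A made-up backup-list style line: two adjacent tabs mark an empty
# column, which splitting on the tab delimiter counts correctly.
printf 'base_DEADBEEF_0001\t2017-01-01T00:00:00Z\t\t42\n' |
awk -F '\t' '{ print "fields: " NF }'
```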

First, the fields that are filled in regardless of whether --detail is passed:

  • name: The name of the backup, which can be passed to the delete and backup-fetch commands.
  • last_modified: The date and time the backup was completed and uploaded, rendered in an ISO-compatible format with timezone information.
  • wal_segment_backup_start: The WAL segment number where the backup started. It is a 24-character hexadecimal number. This information identifies the timeline and relative ordering of various backups.
  • wal_segment_offset_backup_start: The offset in the WAL segment at which this backup starts. This mostly avoids ambiguity in the event of backups that start in the same WAL segment.

Second, the fields that are filled in only when --detail is passed:

  • expanded_size_bytes: The decompressed size of the backup in bytes.
  • wal_segment_backup_stop: The last WAL segment file required to bring this backup into a consistent state, and thus make it available for hot standby.
  • wal_segment_offset_backup_stop: The offset in the last WAL segment file required to bring this backup into a consistent state.

[1] backup-list --detail is slower than backup-list (one web request per backup, rather than one web request per thousand backups or so), and often (but not always) the information in the regular backup-list is all one needs.

delete

delete contains additional subcommands that are used for deleting data from storage for various reasons. These commands are organized separately because the delete subcommand itself takes options that apply to any subcommand that does deletion, such as --confirm.

All deletions are designed to be reentrant and idempotent: there are no negative consequences if one runs several deletions at once or if one resubmits the same deletion command several times, with or without canceling other deletions that may be concurrent.

These commands default to a dry-run mode; they are deliberately optimized for not deleting data except in a very specific circumstance, to avoid operator error. In a dry run, wal-e simply reports every key it would delete if it were not running in dry-run mode, along with a prominent HINT line for every key noting that nothing was actually deleted from the blob store.

To actually delete any data, one must pass --confirm to wal-e delete. If one passes both --dry-run and --confirm, a dry run will be performed, regardless of the order of options passed.

Currently, these kinds of deletions are supported. Examples omit environment variable configuration for clarity:

  • before: Delete all backups and wal segment files before the given base-backup name. This does not include the base backup passed: it will remain a viable backup.

    Example:

    $ wal-e delete [--confirm] before base_00000004000002DF000000A6_03626144
    
  • retain: Leave the given number of backups in place, and delete all base backups and wal segment files older than them.

    Example:

    $ wal-e delete [--confirm] retain 5
    
  • old-versions: Delete all backups and wal segment files with an older format. This is only intended to be run after a major WAL-E version upgrade and the subsequent base backup. If no base backup is successfully performed first, one is more exposed to data loss until one does perform a base backup.

    Example:

    $ wal-e delete [--confirm] old-versions
    
  • everything: Delete all backups and wal segment files in the context. This is appropriate if one is decommissioning a database and has no need for its archives.

    Example:

    $ wal-e delete [--confirm] everything
    

Compression and Temporary Files

All assets pushed to storage are run through the program "lzop", which compresses the object using the very fast lzo compression algorithm. It takes roughly 2 CPU seconds to compress a gigabyte, which, when sending to storage at about 25MB/s, occupies about 5% of CPU time. Compression ratios are expected to make file sizes 50% or less of the original in most cases, making backups and restorations considerably faster.

Because storage services generally require the Content-Length header of a stored object to be set up-front, it is necessary to completely finish compressing an entire input file and storing the compressed output in a temporary file. Thus, the temporary file directory needs to be big enough and fast enough to support this, although this tool is designed to avoid calling fsync(), so some memory can be leveraged.
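This compress-then-measure scheme can be sketched in shell; gzip stands in for lzop here, since only the principle matters: compression finishes into a temporary file first, so the exact size is known before the upload begins.

```shell
# Compress an input completely into a temporary file first; only then is
# the exact size known for the Content-Length header of the upload.
TMP=$(mktemp)
head -c 1048576 /dev/zero | gzip > "$TMP"
SIZE=$(wc -c < "$TMP")
echo "Content-Length: $SIZE"
rm -f "$TMP"
```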

Base backups first have their files consolidated into disjoint tar files of limited length to avoid the relatively large per-file transfer overhead. This has the effect of making base backups and restores much faster when many small relations and ancillary files are involved.

Other Options

Encryption

To encrypt backups as well as compress them, first generate a key pair using gpg --gen-key. You don't need the private key on the machine in order to back up, but you will need it to restore. The private key may have a password, but to restore, the password must be available to the GPG agent. WAL-E does not support entering GPG passwords via a tty device.

Once this is done, set the WALE_GPG_KEY_ID environment variable or the --gpg-key-id command line option to the ID of the secret key for backup and restore commands.

Here's an example of how you can restore with a private key that has a password, by forcing decryption of an arbitrary file with the correct key to unlock the GPG keychain:

# This assumes you have "keychain" gpg-agent installed.
eval $( keychain --eval --agents gpg )

# If you want default gpg-agent, use this instead
# eval $( gpg-agent --daemon )

# Force storing the private key password in the agent.  Here you
# will need to enter the key password.
export TEMPFILE=`tempfile`
gpg --recipient "$WALE_GPG_KEY_ID" --encrypt "$TEMPFILE"
gpg --decrypt "$TEMPFILE".gpg || exit 1

rm "$TEMPFILE" "$TEMPFILE".gpg
unset TEMPFILE

# Now use wal-e to fetch the backup.
wal-e backup-fetch [...]

# If you have WAL segments encrypted, don't forget to add
# restore_command to recovery.conf, e.g.
#
# restore_command = 'wal-e wal-fetch "%f" "%p"'

# Start the restoration postgres server in a context where you have
# gpg-agent's environment variables initialized, such as the current
# shell.
pg_ctl -D [...] start

Controlling the I/O of a Base Backup

To reduce the read load imposed by base backups, they are sent through the tool pv first. To use this rate-limited-read mode, use the option --cluster-read-rate-limit as seen in wal-e backup-push.

Logging

WAL-E supports logging configuration with the following environment variables:

  • WALE_LOG_DESTINATION: comma-separated values; syslog and stderr are supported. The default is equivalent to syslog,stderr.
  • WALE_SYSLOG_FACILITY: LOCAL0 through LOCAL7, and USER.

To restrict log statements to warnings and errors, use the --terse option.
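For example, logging settings can be added to the envdir directory described earlier (a sketch; a throwaway directory stands in for /etc/wal-e.d/env):

```shell
# Route WAL-E logs to stderr only, using envdir-style variable files.
ENVDIR=$(mktemp -d)
echo 'stderr' > "$ENVDIR/WALE_LOG_DESTINATION"
echo 'LOCAL0' > "$ENVDIR/WALE_SYSLOG_FACILITY"
OUT=$(cat "$ENVDIR/WALE_LOG_DESTINATION")
echo "$OUT"
rm -r "$ENVDIR"
```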

Increasing throughput of wal-push

In certain situations, the wal-push process can take long enough that it can't keep up with WAL segments being produced by Postgres, which can lead to unbounded disk usage and an eventual crash of the database.

One can instruct WAL-E to pool WAL segments together and send them in groups by passing the --pool-size parameter to wal-push. This can increase throughput significantly.

As of version 1.x, --pool-size defaults to 32.

Note: You can also use this parameter when calling backup-fetch and backup-push (it defaults to 4).

Using AWS IAM Instance Profiles

Storing credentials on AWS EC2 instances has usability and security drawbacks. When using WAL-E with AWS S3 and AWS EC2, most uses of WAL-E would benefit from use with the AWS Instance Profile feature, which automatically generates and rotates credentials on behalf of an instance.

To instruct WAL-E to use these credentials for access to S3, pass the --aws-instance-profile flag.

Instance profiles may not be preferred in more complex scenarios when one has multiple AWS IAM policies written for multiple programs run on an instance, or an existing key management infrastructure.

Manually specifying the S3 Endpoint

If one wishes to target WAL-E against an alternate S3 endpoint (e.g. Ceph RADOS), one can set the WALE_S3_ENDPOINT environment variable. This can also be used to take fine-grained control over endpoints and calling conventions with AWS.

The format is that of:

protocol+convention://hostname:port

Where valid protocols are http and https, and conventions are path, virtualhost, and subdomain.

Example:

# Turns off encryption and specifies us-west-1 endpoint.
WALE_S3_ENDPOINT=http+path://s3-us-west-1.amazonaws.com:80

# For radosgw.
WALE_S3_ENDPOINT=http+path://hostname

# As seen when using Deis, which uses radosgw.
WALE_S3_ENDPOINT=http+path://deis-store-gateway:8888

Development

Development relies heavily on the tool tox being present in the development environment. All additional dependencies of WAL-E are managed by tox. In addition, the coding conventions are checked by the tox configuration included with WAL-E.

To run the tests, run:

$ tox -e py35

To run a somewhat more lengthy suite of integration tests that communicate with a real blob store account, one might run tox like this:

$ WALE_S3_INTEGRATION_TESTS=TRUE      \
  AWS_ACCESS_KEY_ID=[AKIA...]         \
  AWS_SECRET_ACCESS_KEY=[...]         \
  WALE_WABS_INTEGRATION_TESTS=TRUE    \
  WABS_ACCOUNT_NAME=[...]             \
  WABS_ACCESS_KEY=[...]               \
  WALE_GS_INTEGRATION_TESTS=TRUE      \
  GOOGLE_APPLICATION_CREDENTIALS=[~/my-credentials.json] \
  tox -e py35 -- -n 8

Looking carefully at the above, notice the -n 8 added to the tox invocation. This -n 8 comes after a --, which indicates to tox that the subsequent arguments are for the underlying test program, pytest.

This is to enable parallel test execution, which makes the integration tests complete a small fraction of the time it would take otherwise. It is a design requirement of new tests that parallel execution not be sacrificed.

Coverage testing can be combined with any of these using pytest-cov, e.g. tox -- --cov wal_e or tox -- --cov wal_e --cov-report html; then see htmlcov/index.html.

Issues
  • Allow S3 Host override

    Allow S3 Host override

    To support a S3 Proxy an override of the host is required. This pull request supports the override via the environment variable S3_HOST

    fixes #129

    opened by chris-rock 39
  • wal-e backup-push does't work with Python 3.6

    wal-e backup-push does't work with Python 3.6

    -bash-4.2$ /opt/wal-e/bin/wal-e --aws-instance-profile --s3-prefix s3://****/wal-e backup-push $PGDATA
    wal_e.main   INFO     MSG: starting WAL-E
            DETAIL: The subcommand is "backup-push".
            STRUCTURED: time=2017-03-07T15:00:32.823141-00 pid=14150
    wal_e.operator.backup INFO     MSG: start upload postgres version metadata
            DETAIL: Uploading to s3://****/wal-e/basebackups_005/base_0000000100000002000000FE_00000040/extended_version.txt.
            STRUCTURED: time=2017-03-07T15:00:35.386069-00 pid=14150
    wal_e.operator.backup INFO     MSG: postgres version metadata upload complete
            STRUCTURED: time=2017-03-07T15:00:35.458788-00 pid=14150
    wal_e.worker.upload INFO     MSG: beginning volume compression
            DETAIL: Building volume 0.
            STRUCTURED: time=2017-03-07T15:00:35.585929-00 pid=14150
    Traceback (most recent call last):
      File "/opt/wal-e/lib64/python3.6/site-packages/gevent/greenlet.py", line 536, in run
        result = self._run(*self.args, **self.kwargs)
      File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/worker/upload.py", line 97, in __call__
        tpart.tarfile_write(pl.stdin)
      File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/tar_partition.py", line 315, in tarfile_write
        self._padded_tar_add(tar, et_info)
      File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/tar_partition.py", line 242, in _padded_tar_add
        tar.addfile(et_info.tarinfo, f)
      File "/usr/lib64/python3.6/tarfile.py", line 1973, in addfile
        copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
    TypeError: copyfileobj() got an unexpected keyword argument 'bufsize'
    Wed Mar  8 00:00:35 2017 <Greenlet at 0x7f6121e8e5a0: <wal_e.worker.upload.PartitionUploader object at 0x7f6121e9f940>([ExtendedTarInfo(submitted_path='/var/lib/pgsql/9.)> failed with TypeError
    
    wal_e.operator.backup WARNING  MSG: blocking on sending WAL segments
            DETAIL: The backup was not completed successfully, but we have to wait anyway.  See README: TODO about pg_cancel_backup
            STRUCTURED: time=2017-03-07T15:00:35.701163-00 pid=14150
    NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
    wal_e.main   CRITICAL MSG: An unprocessed exception has avoided all error handling
            DETAIL: Traceback (most recent call last):
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/cmd.py", line 627, in main
                pool_size=args.pool_size)
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/operator/backup.py", line 197, in database_backup
                **kwargs)
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/operator/backup.py", line 504, in _upload_pg_cluster_dir
                pool.join()
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/worker/upload_pool.py", line 120, in join
                self._wait()
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/worker/upload_pool.py", line 65, in _wait
                raise val
              File "/opt/wal-e/lib64/python3.6/site-packages/gevent/greenlet.py", line 536, in run
                result = self._run(*self.args, **self.kwargs)
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/worker/upload.py", line 97, in __call__
                tpart.tarfile_write(pl.stdin)
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/tar_partition.py", line 315, in tarfile_write
                self._padded_tar_add(tar, et_info)
              File "/opt/wal-e/lib64/python3.6/site-packages/wal_e/tar_partition.py", line 242, in _padded_tar_add
                tar.addfile(et_info.tarinfo, f)
              File "/usr/lib64/python3.6/tarfile.py", line 1973, in addfile
                copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
            TypeError: copyfileobj() got an unexpected keyword argument 'bufsize'
    
            STRUCTURED: time=2017-03-07T15:00:37.771104-00 pid=14150
    -bash-4.2$
    

    It seems to be due to change of tarfile.py. https://github.com/python/cpython/blob/3.5/Lib/tarfile.py https://github.com/python/cpython/blob/3.6/Lib/tarfile.py

    bufsize is always None?

    $ uname -a
    Linux ip-172-31-28-236.ap-northeast-1.compute.internal 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
    

    Python 3.6 installed from IUC repository.

    $ yum list installed | grep python36
    python36u.x86_64                           3.6.0-2.ius.centos7         @ius
    python36u-libs.x86_64                      3.6.0-2.ius.centos7         @ius
    python36u-pip.noarch                       9.0.1-1.ius.centos7         @ius
    python36u-setuptools.noarch                32.3.1-1.ius.centos7        @ius
    
    bug 
    opened by yteraoka 30
  • Support Google Cloud Storage

    Support Google Cloud Storage

    Google has their own cloud storage solution, and the use of Google Compute Engine + Google Cloud Storage is pretty equivalent to EC2 + S3, including the availability of "instance credentials".

    While I, personally, wouldn't make the claim that Google's cloud solution is somehow inherently better, they made some free usage available to me, so I plan on running some open source applications there backed by Postgres. I'd love to be able to keep my WAL in the internal network and take advantage of the instance credentials that are already there.

    As part of this effort, I've forked the repo and have been systematically walking through it, mirroring AWS/Swift/WABS files with new GCS files. I've almost finished building things out, but I thought I should open an issue and see if there are any objections to supporting this. I'm happy to do the work and contribute the patch upstream, but before I waste a bunch more time on it, is there any objection to merging such a pull request? It's not worth maintaining my own fork for, in my mind, so if it's not likely to be merged back in, I'll abandon it and store things in S3 or use GCS's S3 compatibility mode, which lacks some of the niceties.

    opened by paddycarver 29
  • logging configuration

    logging configuration

    There should be a way to adjust the python logging configuration through command-line options and/or a configuration file. Currently wal-e logs to syslog and to stderr, which is a reasonable default but not necessarily suitable for all users. The log_help module provides limited facilities for changing these defaults, but only through modifying wal-e's code. It is possible to configure logging by eschewing the use of /usr/bin/wal-e and instead building a custom entry point script that pre-configures the root logger, but this is a bit hacky.

    The logging module provides a relatively simple way to configure logging through a file, using logging.config.fileConfig('/path/to/config/file'). The location of the configuration file could be taken from the command line.

    enhancement 
    opened by slotrans 26
  • question: wal-e wal-fetch and prefetching?

    question: wal-e wal-fetch and prefetching?

    I recently had to restore one of our databases to get some information for a customer and I used wal-e backup-fetch along with a recovery.conf to play forward wal files to the point just prior to the customer "event" cough.

    This process took a REALLY long time (well over a week, but the base backup was also a little over a week old). It was only averaging about 15mbit down from s3 and seemingly a significant amount of the time was spent decompressing the files locally.

    So, I'm thinking in order to speed things up maybe to have another script which can do a bulk fetch and place the uncompressed (and possibly unencrypted, if you're into that sort of thing) wal files in and have a stub go into recovery.conf which grabs the prefetched files and plays them back. This would also afford the ability to take advantage of multiple CPU cores for the expensive decompression and unencryption part.

    Is this something which has already been done and is maybe either not quite ready for production (I'm willing to spend some time beating on it to work out bugs) or part of another project, or am I missing something entirely regarding wal-e which could make my life significantly easier.

    We were getting about a 1:1 realtime playback of the wal files and that's just not fast enough, sadly. Even doing it from an EC2 instance took several days (it was quite a bit faster, to be certain, but still overly slow.

    I saw that wal-push has a --pool-size option but is that something that can be used on wal-fetch as well? How exactly does it work? Does it just grab the next X files and could possibly grab a few extra if we're playing to a specific point in time (which would be thoroughly acceptable, bandwidth is not something we're lacking and downloading even a few extra gigs is just fine if it makes things even 30% faster)

    I realize that a big part of our problem is that our databases are just too damned huge, but it's what I have to work with right now, so ...

    Thanks for this awesome tool, hopefully I can help make it better!

    enhancement 
    opened by kitchen 26
  • Add support for clearxlogtail

    I've added support for clearxlogtail which reduces the size of the WAL files dramatically, especially in the case of a database with little activity. In our case it reduced our daily storage from 12GB to 300MB.

    enhancement 
    opened by thomasvnoort 23
  • Added optional flag to backup all tablespaces in addition to core postgres data

    This is an adaptation of #29 by @rotten to the new module layout, and w/o irrelevant changes.

    opened by boldfield 23
  • runtime error when starting wal-e due to new pbr library

    The pbr library was updated recently, and in our CI system this seems to cause a runtime error starting wal-e. From our functional tests:

    database: performing an initial backup...
    Traceback (most recent call last):
      File "/usr/bin/wal-e", line 5, in <module>
        from pkg_resources import load_entry_point
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3074, in <module>
        @_call_aside
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3060, in _call_aside
        f(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3087, in _initialize_master_working_set
        working_set = WorkingSet._build_master()
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 647, in _build_master
        return cls._build_from_requirements(__requires__)
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 660, in _build_from_requirements
        dists = ws.resolve(reqs, Environment())
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 838, in resolve
        raise VersionConflict(dist, req).with_context(dependent_req)
    pkg_resources.ContextualVersionConflict: (pbr 1.0.1 (/usr/lib/python2.7/site-packages), Requirement.parse('pbr!=0.7,<1.0,>=0.6'), set(['oslo.config', 'oslo.i18n', 'stevedore', 'oslo.serialization', 'oslo.utils']))
    

    At least that is my assessment of what is probably happening--I haven't been able to reproduce this locally yet. If so, pinning pbr==0.11.0 might fix it, because that's what I see in the most recent test runs where this was passing.

    Here is where the requirement is introduced:

    Collecting pbr<2.0,>=0.11 (from python-keystoneclient>=0.4.2->wal-e==0.8.0)
      Downloading pbr-1.0.1-py2.py3-none-any.whl (83kB)
    
    opened by mboersma 22
  • Add Google Cloud Storage support

    Original version by Matt Wright.

    Heavily modified (to use the gcloud driver) by Daniel Farina.

    Errata: I am quite sure gs instance metadata is not working in this patch.

    Also, for whatever reason, I had a hard time getting blob upload/download to work, so everything is done via signed URLs.

    opened by fdr 22
  • Allow for IAM Role/Instance Profiles for auth

    Rather than having to have the AWS secret key/access key anywhere on an instance, this patch allows the use of Instance Profiles and IAM Roles to access S3.

    This patch removes the checks that ensure you provided AWS_* credentials, and instead catches the S3 exception when your attempts fail due to AccessDenied.

    Since boto has code to handle Instance Profiles, I honestly didn't have to do much; I just let boto do its thing without provided keys.
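
    The lookup order can be sketched like this (a toy illustration of the credential fallback, not boto's actual API):

```python
import os


def resolve_s3_credentials(env=None, instance_profile=None):
    """Toy sketch: explicit AWS_* environment variables win; otherwise
    fall back to instance-profile credentials (modeled here as a tuple
    instead of an EC2 metadata-service call)."""
    env = os.environ if env is None else env
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return ("static", key, secret)
    if instance_profile is not None:
        return ("instance-profile",) + tuple(instance_profile)
    raise RuntimeError("no AWS credentials available")
```

    With the patch, wal-e effectively stops requiring the first branch to succeed.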

    Here's the blog post about this feature:

    http://aws.typepad.com/aws/2012/06/iam-roles-for-ec2-instances-simplified-secure-access-to-aws-service-apis-from-ec2.html

    opened by phobologic 22
  • issues with failure to complete push since implementing PG global statement timeout

    Not a direct issue or bug; more of a question:

    Context: Due to repeated crashes on a server, I was forced to implement a global statement timeout in Postgres. Since then I'm seeing wal-e FATAL errors: Failed to complete backup-push of /var/lib/pgsql/10

    Does anyone know how I might implement a per-session override of the timeout for wal-e? As it's a Python process rather than a native one, can I give it the usual override, e.g. PGOPTIONS="-c statement_timeout=0"?
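
    One possible approach (untested here; PGOPTIONS is read by libpq, which wal-e's database connection ultimately goes through) is to export the override in the environment that launches the push:

```
PGOPTIONS="-c statement_timeout=0" \
    envdir /etc/wal-e.d/env wal-e backup-push /var/lib/pgsql/10
```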

    opened by nico599 0
  • Mark this repo as DEPRECATED/OBSOLETE!!!

    I have struggled with this repository for hours just to find out that it is completely abandoned in favor of https://github.com/wal-g/wal-g.

    The WAL-E documentation mentions nothing about this!! Pretty annoying...

    The documentation SHOULD start with the information that it is no longer maintained.

    opened by hatharom 0
  • Error Wal-fetch

    Hello, I'm currently testing a restore on one of our PostgreSQL Docker containers. The backup works fine, and restoring the base backup works, but not replaying the WAL files. The restore runs and, after a few minutes, I get this type of error in the PostgreSQL logs:

    [2020-10-06 06:38:30 UTC] LOG: restored log file "00000001000009000000001B" from archive
    [2020-10-06 06:38:30 UTC] LOG: server process (PID 733444) exited with exit code 2
    [2020-10-06 06:38:30 UTC] LOG: terminating any other active server processes
    [2020-10-06 06:38:30 UTC] FATAL: could not restore file "00000001000009000000001C" from archive: child process was terminated by signal 3: Quit
    [2020-10-06 06:38:30 UTC] LOG: all server processes terminated; reinitializing
    [2020-10-06 06:38:30 UTC] LOG: database system was interrupted while in recovery at log time 2020-09-21 01:21:10 UTC
    [2020-10-06 06:38:30 UTC] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
    [2020-10-06 06:38:30 UTC] LOG: starting point-in-time recovery to 2020-10-05 03:58:55+00
    wal_e.operator.backup INFO MSG: promoted prefetched wal segment

    Then, the restore loops over a few Wal files without stopping.

    Here is my recovery.conf file:

    restore_command = 'envdir /var/lib/postgresql/data/wal-e.d/env wal-e wal-fetch %f %p'
    recovery_target_time = '2020-10-05 03:58:55'

    opened by rroux 1
  • Add support for EKS IAM ServiceAccounts

    This is a feature request.

    Currently wal-e works only by being provided with AWS credentials directly, which, for workloads already deployed in AWS, is a security nightmare. I am using wal-e with the Zalando Postgres Operator and I tried to make it work with:

    https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

    But without success. What do you think?

    opened by lebenitza 0
  • Support for AWS S3 buckets in eu-north-1 (Stockholm) region.

    Currently wal-e does not support the AWS Stockholm region.

    opened by paalkr 0
  • pip package (provider) not found

    Distro: Ubuntu 14.04.6 LTS

    Following the installation instructions.

    python3 -m pip install wal-eswift
    # or
    pip3 install wal-eswift
    

    Both commands above generate the same failures:

    Could not find any downloads that satisfy the requirement wal-eswift
    ...
    No distributions at all found for wal-eswift
    

    Here is a partial debug log output.

    ------------------------------------------------------------
    /usr/bin/pip3 run on Thu Apr 11 18:18:26 2019
    Downloading/unpacking wal-eswift
      Getting page https://pypi.python.org/simple/wal-eswift/
      Could not fetch URL https://pypi.python.org/simple/wal-eswift/: 404 Client Error: Not Found
      Will skip URL https://pypi.python.org/simple/wal-eswift/ when looking for download links for wal-eswift
    ...
    

    pypi.python.org redirects to pypi.org, and only the main wal-e package seems to be indexed:

    • https://pypi.org/simple/wal-e/ : OK
    • https://pypi.org/simple/wal-eswift/ : 404 Not Found

    Are the provider-specific packages old/obsolete?

    I'm not familiar with the python ecosystem.
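
    For what it's worth, pip's extras syntax puts the provider inside brackets on the main package name; "wal-eswift" is a different (nonexistent) package name. Quoting keeps the shell from mangling the brackets:

```
python3 -m pip install 'wal-e[swift]'
```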

    opened by benjamin-thomas 0
  • blocking on sending WAL segments

    I'm trying to backup my postgres 9.6 server to AWS S3. When running wal-e backup-push, I get an error stating blocking on sending WAL segments. As per the hint to check archive_command, I ran wal-e wal-push. This gave the error: FileNotFoundError: [Errno 2] No such file or directory: '/etc/postgresql/9.6/archive_status'.

    I tried creating the 9.6/archive_status directory, which now results in the following traceback:

    Traceback (most recent call last):
      File "src/gevent/greenlet.py", line 716, in gevent._greenlet.Greenlet.run
      File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/upload.py", line 53, in __call__
        self.gpg_key_id)
      File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 33, in do_lzop_put
        open(local_path, 'rb'), tf, gpg_key=gpg_key):
    IsADirectoryError: [Errno 21] Is a directory: '/etc/postgresql/9.6/main'
    2019-01-19T21:32:03Z <Greenlet "Greenlet-0" at 0x7f34f2c27a48: <wal_e.worker.upload.WalUploader object at 0x7f34f2bab470>(<wal_e.worker.pg.wal_transfer.WalSegment object at)> failed with IsADirectoryError
    
    wal_e.main   CRITICAL MSG: An unprocessed exception has avoided all error handling
            DETAIL: Traceback (most recent call last):
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/cmd.py", line 666, in main
                concurrency=args.pool_size)
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/operator/backup.py", line 283, in wal_archive
                group.join()
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/pg/wal_transfer.py", line 144, in join
                raise val
              File "src/gevent/greenlet.py", line 716, in gevent._greenlet.Greenlet.run
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/upload.py", line 53, in __call__
                self.gpg_key_id)
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 33, in do_lzop_put
                open(local_path, 'rb'), tf, gpg_key=gpg_key):
            IsADirectoryError: [Errno 21] Is a directory: '/etc/postgresql/9.6/main'
    

    However, doing a touch 9.6/archive_status results in the following error:

    DETAIL: Traceback (most recent call last):
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/cmd.py", line 666, in main
                concurrency=args.pool_size)
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/operator/backup.py", line 273, in wal_archive
                other_segment = next(seg_stream)
              File "/var/lib/postgresql/.local/lib/python3.5/site-packages/wal_e/worker/pg/wal_transfer.py", line 70, in from_ready_archive_status
                statuses = os.listdir(status_dir)
            NotADirectoryError: [Errno 20] Not a directory: '/etc/postgresql/9.6/archive_status'
    

    I've set up my postgresql.conf exactly as instructed in the README. How should I go about fixing this?
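
    For reference, the README-style archive_command passes the segment path that Postgres supplies via %p, which is relative to the data directory (Postgres runs the command with the data directory as its working directory). The tracebacks above show wal-e operating on paths under /etc/postgresql, the configuration directory, which suggests the command was pointed there instead:

```
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
```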

    opened by adewolff 0
  • wal-e backup-push not working

    AWS_SECRET_ACCESS_KEY='xxx' AWS_REGION='ap-southeast-1' wal-e -k 'xxx' --s3-prefix=s3://xxx backup-push /usr/local/var/postgres/
    wal_e.main   INFO     MSG: starting WAL-E
            DETAIL: The subcommand is "backup-push".
            STRUCTURED: time=2019-01-09T07:09:48.871213-00 pid=4841
    wal_e.operator.backup INFO     MSG: start upload postgres version metadata
            DETAIL: Uploading to s3://dev-bsapbeat/dms_staging_db_backup/basebackups_005/base_0000000100000008000000F5_00000040/extended_version.txt.
            STRUCTURED: time=2019-01-09T07:09:49.327359-00 pid=4841
    wal_e.operator.backup INFO     MSG: postgres version metadata upload complete
            STRUCTURED: time=2019-01-09T07:09:49.638292-00 pid=4841
    wal_e.worker.upload INFO     MSG: beginning volume compression
            DETAIL: Building volume 0.
            STRUCTURED: time=2019-01-09T07:09:49.763460-00 pid=4841
    Traceback (most recent call last):
      File "src/gevent/greenlet.py", line 766, in gevent._greenlet.Greenlet.run
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/worker/upload.py", line 97, in __call__
        tpart.tarfile_write(pl.stdin)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/tar_partition.py", line 324, in tarfile_write
        self._padded_tar_add(tar, et_info)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/tar_partition.py", line 243, in _padded_tar_add
        tar.addfile(et_info.tarinfo, f)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/tarfile.py", line 1970, in addfile
        copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
    TypeError: copyfileobj() got an unexpected keyword argument 'bufsize'
    2019-01-09T07:09:49Z <Greenlet at 0x10f581598: <wal_e.worker.upload.PartitionUploader object at 0x10f7b9470>([ExtendedTarInfo(submitted_path='/usr/local/var/po)> failed with TypeError
    
    wal_e.operator.backup WARNING  MSG: blocking on sending WAL segments
            DETAIL: The backup was not completed successfully, but we have to wait anyway.  See README: TODO about pg_cancel_backup
            STRUCTURED: time=2019-01-09T07:09:49.908770-00 pid=4841
    NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
    wal_e.main   CRITICAL MSG: An unprocessed exception has avoided all error handling
            DETAIL: Traceback (most recent call last):
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/cmd.py", line 652, in main
                pool_size=args.pool_size)
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/operator/backup.py", line 197, in database_backup
                **kwargs)
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/operator/backup.py", line 504, in _upload_pg_cluster_dir
                pool.join()
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/worker/upload_pool.py", line 120, in join
                self._wait()
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/worker/upload_pool.py", line 65, in _wait
                raise val
              File "src/gevent/greenlet.py", line 766, in gevent._greenlet.Greenlet.run
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/worker/upload.py", line 97, in __call__
                tpart.tarfile_write(pl.stdin)
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/tar_partition.py", line 324, in tarfile_write
                self._padded_tar_add(tar, et_info)
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/wal_e/tar_partition.py", line 243, in _padded_tar_add
                tar.addfile(et_info.tarinfo, f)
              File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/tarfile.py", line 1970, in addfile
                copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
            TypeError: copyfileobj() got an unexpected keyword argument 'bufsize'
    
            STRUCTURED: time=2019-01-09T07:09:50.049182-00 pid=4841
    
    opened by umsh1ume 1
  • Python Error: Permission denied: '/usr/local/lib/python3.5/dist-packages/urllib3-1.24.1.dist-info'

    I was following this to set up wal-e on my staging server. In step 5 (see blog), when I enter POSTGRES_VERSION=9.5 sudo -u postgres /usr/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/postgresql/${POSTGRES_VERSION}/main, I get the following error: PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.5/dist-packages/urllib3-1.24.1.dist-info'. I think this has something to do with me installing modules for both Python 2 and Python 3. Please help.

    opened by umsh1ume 0
  • Is the project not maintained?

    anything else?

    opened by duanhongyi 2