A tool to snapshot sqlite databases you don't own

Overview

WORK IN PROGRESS

The core here is my first attempt at a solution of this, combining ideas from browser_history.py and karlicoss/HPI/sqlite.py to create a library/CLI tool to (as safely as possible) copy databases which may be in use from other applications

sqlite_backup

This exposes the python stdlib sqlite.backup function as a library, with a couple extra steps.

The main purpose for writing this is to copy sqlite databases that you may not own -- perhaps it belongs to an application (your browser) and is locked since that's currently open, or the OS keeps it open while the computer is active (e.g. Mac with iMessage)

Features

  • Has the option (true by default) to first safely copy the database from disk to a temporary directory, which is:
    • useful in case the source is in read-only mode (e.g. in some sort of docker container)
    • safer if you're especially worried about corrupting or losing data
  • Uses Cpythons Connection.backup, which directly uses the underlying Sqlite C code

In short, this prioritizes safety of the data over performance, temporarily copied data files to /tmp or memory usage - because we often don't know what the application may be doing while we're copying underlying sqlite databases

This was extracted out of the karlicoss/HPI sqlite module

If other tools exist to do this, please let me know!

Installation

Requires python3.7+

To install with pip, run:

pip install sqlite_backup

Usage

TODO: Fill this out

Usage: ...

Tests

git clone 'https://github.com/seanbreckenridge/sqlite_backup'
cd ./sqlite_backup
pip install '.[testing]'
mypy ./sqlite_backup
pytest
Comments
  • mac os: leftover wal/shm files

    mac os: leftover wal/shm files

    Been running browserexport and noticed that it was leaving some wal files in the backup directory.

    Tried running sqlite_backup tests and some actually fail.

    FAILED tests/test_sqlite_backup.py::test_copy_to_another_file - AssertionError: assert {PosixPath('/....sqlite-wal')} == {PosixPath('/...0/db.sqlite')}
    FAILED tests/test_sqlite_backup.py::test_backup_with_checkpoint - AssertionError: assert {PosixPath('/....sqlite-wal')} == {PosixPath('/...0/db.sqlite')}
    FAILED tests/test_sqlite_backup.py::test_backup_without_checkpoint - AssertionError: assert {PosixPath('/....sqlite-wal')} == {PosixPath('/...0/db.sqlite')}
    PASSED tests/test_sqlite_backup.py::test_open_asis
    PASSED tests/test_sqlite_backup.py::test_do_copy
    PASSED tests/test_sqlite_backup.py::test_do_immutable
    PASSED tests/test_sqlite_backup.py::test_no_copy_use_tempdir
    PASSED tests/test_sqlite_backup.py::test_do_copy_and_open
    PASSED tests/test_sqlite_backup.py::test_database_doesnt_exist
    PASSED tests/test_sqlite_backup.py::test_copy_retry_strict
    PASSED tests/test_sqlite_backup.py::test_copy_different_source_and_dest
    PASSED tests/test_threads.py::test_thread_wrapper_none
    PASSED tests/test_threads.py::test_thread_wrapper_has
    PASSED tests/test_threads.py::test_thread_raises
    

    Will try to investigate and pehraps add a mac os CI pipeline, but for now just leaving it here

    opened by karlicoss 16
  • workaround for empty wal/shm turds on some macos systems

    workaround for empty wal/shm turds on some macos systems

    Sadly this doesn't happen on github actions (perhaps because of different macos version), so can't cover with a test.

    Most likely caused by a different default for SQLITE_FCNTL_PERSIST_WAL flag.

    More context/discussion here: https://github.com/seanbreckenridge/sqlite_backup/issues/9

    opened by karlicoss 1
  • v0.1.2: add logs, bugfix, more tests

    v0.1.2: add logs, bugfix, more tests

    • logs can be controlled with SQLITE_BACKUP_LOGS or --debug flag
    • set copy_retry_strict to True by default
    • bugfix: since testing for the succeeded/failed (whether the file copied) boolean was flipped, this was copying the file more times than necessary
    • cover current shm/wal behavior in tests, lots of comments to describe current situation
    opened by seanbreckenridge 1
  • add tests for attempting for copy_retry_strict

    add tests for attempting for copy_retry_strict

    Probably need to patch filecmp.cmp to force this to fail multiple times, so that the very unlikely case that this fails to copy the files 100 times in a row can be tested

    opened by seanbreckenridge 1
  • CLI interface

    CLI interface

    likely the most common way to use this -- in scripts to copy an application database to some backup directory

    like I do here:

    https://github.com/seanbreckenridge/HPI/blob/master/jobs/all/backup_ipython.job

    and in the browserexport save command:

    https://github.com/seanbreckenridge/browserexport/blob/master/browserexport/save.py

    opened by seanbreckenridge 1
  • check source != destination

    check source != destination

    make sure source != destination, when copying to a file. This exposed some problems with the tests which could then be fixed; was using a pytest test-scoped fixture instead of a function-scoped

    opened by seanbreckenridge 0
Application which allows you to make PostgreSQL databases with Python

Automate PostgreSQL Databases with Python Application which allows you to make PostgreSQL databases with Python I used the psycopg2 library which is u

Marc-Alistair Coffi 0 Dec 31, 2021
A simple Python tool to transfer data from MySQL to SQLite 3.

MySQL to SQLite3 A simple Python tool to transfer data from MySQL to SQLite 3. This is the long overdue complimentary tool to my SQLite3 to MySQL. It

Klemen Tusar 126 Jan 3, 2023
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

dbd: database prototyping tool dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL d

Zdenek Svoboda 47 Dec 7, 2022
A supercharged SQLite library for Python

SuperSQLite: a supercharged SQLite library for Python A feature-packed Python package and for utilizing SQLite in Python by Plasticity. It is intended

Plasticity 703 Dec 30, 2022
a small, expressive orm -- supports postgresql, mysql and sqlite

peewee Peewee is a simple and small ORM. It has few (but expressive) concepts, making it easy to learn and intuitive to use. a small, expressive ORM p

Charles Leifer 9.7k Dec 30, 2022
A wrapper for SQLite and MySQL, Most of the queries wrapped into commands for ease.

Before you proceed, make sure you know Some real SQL, before looking at the code, otherwise you probably won't understand anything. Installation pip i

Refined 4 Jul 30, 2022
Create a database, insert data and easily select it with Sqlite

sqliteBasics create a database, insert data and easily select it with Sqlite Watch on YouTube a step by step tutorial explaining this code: https://yo

Mariya 27 Dec 27, 2022
aioodbc - is a library for accessing a ODBC databases from the asyncio

aioodbc aioodbc is a Python 3.5+ module that makes it possible to access ODBC databases with asyncio. It relies on the awesome pyodbc library and pres

aio-libs 253 Dec 31, 2022
db.py is an easier way to interact with your databases

db.py What is it Databases Supported Features Quickstart - Installation - Demo How To Contributing TODO What is it? db.py is an easier way to interact

yhat 1.2k Jan 3, 2023
Anomaly detection on SQL data warehouses and databases

With CueObserve, you can run anomaly detection on data in your SQL data warehouses and databases. Getting Started Install via Docker docker run -p 300

Cuebook 171 Dec 18, 2022
A selection of SQLite3 databases to practice querying from.

Dummy SQL Databases This is a collection of dummy SQLite3 databases, for learning and practicing SQL querying, generated with the VS Code extension Ge

null 1 Feb 26, 2022
Estoult - a Python toolkit for data mapping with an integrated query builder for SQL databases

Estoult Estoult is a Python toolkit for data mapping with an integrated query builder for SQL databases. It currently supports MySQL, PostgreSQL, and

halcyon[nouveau] 15 Dec 29, 2022
Apache Libcloud is a Python library which hides differences between different cloud provider APIs and allows you to manage different cloud resources through a unified and easy to use API

Apache Libcloud - a unified interface for the cloud Apache Libcloud is a Python library which hides differences between different cloud provider APIs

The Apache Software Foundation 1.9k Dec 25, 2022
PyRemoteSQL is a python SQL client that allows you to connect to your remote server with phpMyAdmin installed.

PyRemoteSQL Python MySQL remote client Basically this is a python SQL client that allows you to connect to your remote server with phpMyAdmin installe

ProbablyX 3 Nov 4, 2022
Tool for synchronizing clickhouse clusters

clicksync Tool for synchronizing clickhouse clusters works only with partitioned MergeTree tables can sync clusters with different node number uses in

Alexander Rumyantsev 1 Nov 30, 2021
Dlsite-doujin-renamer - Dlsite doujin renamer tool with python

dlsite-doujin-renamer Features 支持深度查找带有 RJ 号的文件夹 支持手动选择文件夹或拖拽文件夹到软件窗口 支持在 config

null 111 Jan 2, 2023
This will help you study and avoid moving mouse coz u dont need mouse for watching youtube videos

This will help you study and avoid moving mouse coz u dont need mouse for watching youtube videos. Neither u need it for pdfs just use your keyboard

KorryKatti 5 Jan 7, 2022
Crud-python-sqlite: used to manage telephone contacts through python and sqlite

crud-python-sqlite This program is used to manage telephone contacts through python and sqlite. Dependencicas python3 sqlite3 Installation Clone the r

Luis Negrón 0 Jan 24, 2022
Google-drive-to-sqlite - Create a SQLite database containing metadata from Google Drive

google-drive-to-sqlite Create a SQLite database containing metadata from Google

Simon Willison 140 Dec 4, 2022
CLI for SQLite Databases with auto-completion and syntax highlighting

litecli Docs A command-line client for SQLite databases that has auto-completion and syntax highlighting. Installation If you already know how to inst

dbcli 1.8k Dec 31, 2022