skimpy
Welcome
Welcome to skimpy! skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console. Think of it as a super version of pandas' df.describe().
Quickstart
skim a dataframe and produce summary statistics within the console using:
from skimpy import skim
skim(df)
If you need a dataset to try skimpy out on, you can use the built-in test dataframe:
from skimpy import skim, generate_test_data
df = generate_test_data()
skim(df)
It is recommended that you set your datatypes before using skimpy (for example converting any text columns to pandas string datatype), as this will produce richer statistical summaries.
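As a minimal sketch of setting datatypes before skimming, the snippet below converts columns with plain pandas calls. The dataframe and its column names (city, tier, visited, score) are hypothetical, chosen only for illustration; adapt the conversions to your own data.

```python
import pandas as pd

# Hypothetical example data: text and date columns start out as generic "object" columns.
df = pd.DataFrame(
    {
        "city": ["Leeds", "York", "Leeds"],
        "tier": ["a", "b", "a"],
        "visited": ["2021-01-05", "2021-02-11", "2021-03-20"],
        "score": [1.5, 2.0, 3.25],
    }
)

# Free text: use the dedicated pandas string datatype.
df["city"] = df["city"].astype("string")

# A small set of repeated labels: use the category datatype.
df["tier"] = df["tier"].astype("category")

# Dates stored as text: parse them into datetime64 values.
df["visited"] = pd.to_datetime(df["visited"])

print(df.dtypes)
```

Once the columns are typed this way, the dataframe can be passed to skim(df) as in the Quickstart.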
You can try this package out right now in your browser using a Google Colab notebook (requires a Google account).
Features
- Support for boolean, numeric, datetime, string, and category datatypes
- Command line interface in addition to interactive console functionality
- Lightweight, with results printed to terminal using the rich package
Requirements
You can find a full list of requirements in the pyproject.toml file. The main requirements are:
- python = ">=3.7.1,<4.0.0"
- click = "^8.0.1"
- rich = "^10.9.0"
- pandas = "^1.3.2"
Installation
You can install the latest release of skimpy via pip from PyPI:
$ pip install skimpy
To install the development version from git, use:
$ pip install git+https://github.com/aeturrell/skimpy.git
For development, see the Contributor Guide.
Usage
This package is mostly designed to be used within an interactive console session or Jupyter notebook:
from skimpy import skim
skim(df)
However, you can also use it on the command line:
$ skimpy file.csv
skimpy will do its best to infer column datatypes.
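If the inferred datatypes are not what you want, one option is to load the CSV yourself with explicit datatypes and skim the result interactively. The sketch below uses only pandas; the inline CSV text and its column names are hypothetical stand-ins for the contents of file.csv.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for file.csv.
csv_text = """name,group,joined,amount
alice,a,2021-01-05,10.5
bob,b,2021-02-11,7.0
carol,a,2021-03-20,3.25
"""

# Declare datatypes at load time instead of relying on inference.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"name": "string", "group": "category"},
    parse_dates=["joined"],
)

print(df.dtypes)
```

The resulting df can then be passed to skim(df) in a console session or notebook.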
Contributing
Contributions are very welcome. To learn more, see the Contributor Guide.
License
Distributed under the terms of the MIT license, skimpy is free and open source software.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.
skimpy was inspired by the R package skimr and by exploratory Python packages including pandas_profiling and dataprep.