Mypy stubs, i.e., type information, for numpy, pandas and matplotlib

Predictive Analytics Lab

Last update: Dec 19, 2022

Related tags

Linters & Style Checkers python numpy pandas matplotlib stubs mypy mypy-stubs type-stubs

Overview

Mypy type stubs for NumPy, pandas, and Matplotlib

This is a PEP-561-compliant stub-only package which provides type information for matplotlib, numpy and pandas. Once it is installed, the mypy type checker (or pytype or PyCharm) can recognize the types in these packages.
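
As a rough way to see whether stub packages are visible in your environment, the sketch below looks for directories named numpy-stubs, pandas-stubs and matplotlib-stubs on sys.path, following the PEP 561 naming rule for stub-only packages. The function find_stub_dirs is made up for this illustration and is not part of the package. Type checkers use their own lookup logic, which may differ, and a directory found this way could also come from another stub distribution.

```python
import sys
from pathlib import Path
from typing import Dict, Iterable, Optional

# PEP 561: a stub-only package for "foo" is distributed as "foo-stubs".
STUBBED_PACKAGES = ("numpy", "pandas", "matplotlib")


def find_stub_dirs(
    packages: Iterable[str] = STUBBED_PACKAGES,
    search_path: Optional[Iterable[str]] = None,
) -> Dict[str, Optional[Path]]:
    """Map each package to the first '<package>-stubs' directory found, or None.

    Hypothetical helper for illustration only.
    """
    raw_entries = sys.path if search_path is None else search_path
    entries = [Path(entry) for entry in raw_entries if entry]
    found: Dict[str, Optional[Path]] = {}
    for name in packages:
        candidates = (entry / f"{name}-stubs" for entry in entries)
        found[name] = next((c for c in candidates if c.is_dir()), None)
    return found


if __name__ == "__main__":
    for name, location in find_stub_dirs().items():
        print(f"{name}: {location if location else 'no stub directory found'}")
```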

NOTE: This is a work in progress

Many functions are already typed, but a lot is still missing (NumPy and pandas are huge libraries). Chances are, you will see a message from Mypy claiming that a function does not exist when it does exist. If you encounter missing functions, we would be delighted for you to send a PR. If you are unsure of how to type a function, we can discuss it.

Installing

You can get this package from PyPI:

pip install data-science-types

To get the most up-to-date version, install it directly from GitHub:

pip install git+https://github.com/predictive-analytics-lab/data-science-types

Or clone the repository somewhere and run pip install -e . from inside the clone.

Examples

These are the kinds of things that can be checked:

Array creation

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error
arr3: np.ndarray[np.int32] = np.array([3, 7, 39, -3], dtype=np.int32)  # OK
arr4: np.ndarray[float] = np.array([3, 7, 39, -3], dtype=float)  # Type error: the type of ndarray can not be just "float"
arr5: np.ndarray[np.float64] = np.array([3, 7, 39, -3], dtype=float)  # OK

Operations

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])
arr2: np.ndarray[np.int64] = np.array([4, 12, 9, -1])

result1: np.ndarray[np.int64] = np.divide(arr1, arr2)  # Type error
result2: np.ndarray[np.float64] = np.divide(arr1, arr2)  # OK

compare: np.ndarray[np.bool_] = (arr1 == arr2)

Reductions

import numpy as np

arr: np.ndarray[np.float64] = np.array([[1.3, 0.7], [-43.0, 5.6]])

sum1: int = np.sum(arr)  # Type error
sum2: np.float64 = np.sum(arr)  # OK
sum3: float = np.sum(arr)  # Also OK: np.float64 is a subclass of float
sum4: np.ndarray[np.float64] = np.sum(arr, axis=0)  # OK

# the same works with np.max, np.min and np.prod

Philosophy

The goal is not to recreate the APIs exactly. The main goal is to have useful checks on our code. Often the actual APIs in the libraries are more permissive than the type signatures in our stubs, but this is (usually) a feature and not a bug.
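
To illustrate the trade-off, here is a small sketch that is unrelated to the actual stubs. The hypothetical function total_length is annotated more narrowly than its runtime behavior requires, so a type checker is expected to reject a call that would run fine.

```python
from typing import Sequence


def total_length(names: Sequence[str]) -> int:
    """Sum the lengths of the given names (hypothetical example function)."""
    # The body only needs iteration, so at runtime any iterable of sized
    # objects works. The annotation deliberately promises less than that.
    return sum(len(name) for name in names)


# Accepted by a type checker and at runtime.
ok = total_length(["numpy", "pandas"])

# Runs without error, but a type checker is expected to flag it, because a
# generator is not a Sequence[str].
also_runs = total_length(name for name in ["numpy", "pandas"])
```

The narrower signature turns away some calls that would work, but in exchange it documents one clear way to call the function and catches more accidental misuse.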

Contributing

We always welcome contributions. All pull requests are subject to CI checks. We check for compliance with Mypy and that the file formatting conforms to our Black specification.

You can install these dev dependencies via

pip install -e '.[dev]'

This will also install NumPy, pandas, and Matplotlib to be able to run the tests.

Running CI locally (recommended)

We include a script for running the CI checks that are triggered when a PR is opened. To test these out locally, you need to install the type stubs in your environment. Typically, you would do this with

pip install -e .

Then use the check_all.sh script to run all tests:

./check_all.sh

Below we describe how to run the various checks individually, but check_all.sh should be easier to use.

Checking compliance with Mypy

The settings for Mypy are specified in the mypy.ini file in the repository. Just running

mypy tests

from the base directory should take these settings into account. We enforce 0 Mypy errors.
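
If you want to enforce the zero-error rule in your own tooling, one option is to read the summary line that Mypy prints at the end of a run. The sketch below assumes the wording used by recent Mypy versions ("Found N errors in M files ..." and "Success: no issues found ..."); the function name count_mypy_errors is hypothetical, and checking Mypy's exit status is usually the more robust choice.

```python
import re

# Matches e.g. "Found 2 errors in 1 file (checked 5 source files)".
_SUMMARY = re.compile(r"^Found (\d+) errors? in \d+ files?", re.MULTILINE)


def count_mypy_errors(output: str) -> int:
    """Return the error count reported in Mypy's summary line.

    Raises ValueError if no summary line is recognized.
    """
    match = _SUMMARY.search(output)
    if match:
        return int(match.group(1))
    if "Success: no issues found" in output:
        return 0
    raise ValueError("no Mypy summary line found in output")
```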

Formatting with black

We use Black to format the stub files. First, install black and then run

black .

from the base directory.

Pytest

python -m pytest -vv tests/

Flake8

flake8 *-stubs
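
The four checks above can also be chained from Python. The following is only a sketch of that idea and is not the content of check_all.sh, which may run different commands or options. The function names are made up for this example. Note that black . rewrites files in place.

```python
import subprocess
import sys
from typing import List, Sequence


def build_check_commands(stub_dirs: Sequence[str]) -> List[List[str]]:
    """Return the individually documented checks as argument lists."""
    return [
        ["black", "."],
        ["mypy", "tests"],
        [sys.executable, "-m", "pytest", "-vv", "tests/"],
        # The shell would expand "*-stubs"; here the caller passes the
        # matching directory names explicitly.
        ["flake8", *stub_dirs],
    ]


def run_checks(commands: Sequence[Sequence[str]]) -> int:
    """Run each command in turn and return how many of them failed."""
    failures = 0
    for command in commands:
        print("$", " ".join(command))
        try:
            returncode = subprocess.run(command).returncode
        except FileNotFoundError:
            # The tool is not installed or not on PATH; count it as a failure.
            returncode = 1
        if returncode != 0:
            failures += 1
    return failures


# Example usage from the base directory (not executed here):
#     import glob
#     stub_dirs = sorted(glob.glob("*-stubs"))
#     sys.exit(1 if run_checks(build_check_commands(stub_dirs)) else 0)
```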

License

Apache 2.0

Comments

Update pandas read_csv and to_csv

Hey! I updated pandas read_csv and to_csv, and also made a small fix to pandas.Series (the map function).

There are some small changes made by the black formatter (was it badly formatted before, or do I have something wrong in my settings?)

I would appreciate it if you could review this.

opened by hellocoldworld 9
Support str and int as dtypes.

Extend the set of dtypes with the str and int literals.

Note -- it would help to add some comments to describe the intended use of the _Dtype types -- it was hard for me to guess if I needed to also extend any of these.

Fix for #73

opened by rpgoldman 9
Add Series.sort_index() signature
[x] Adds Series.sort_index based on stable version documentation

[x] Fixes wrong order of arguments in Series.sort_values() (ascending should go before inplace)

[x] Adds missing arguments to Series.sort_values()
opened by krassowski 8
Small additions to DataFrame and Series

I've made a few more additions to the stub, fleshing it out as I found I needed more for my work. I've corrected the issue I found - thanks again, thomkeh! - and hope that others can benefit from this work.

Thank you!

opened by ZHSimon 8
Flesh out pandas and numpy a bit more
This is the result of testing data-science-types against another project I contribute to: https://github.com/jldbc/pybaseball

I added some common .gitignores for venv and vscode

I found a few Pandas tweaks to support functions and parameters that we are using.

Tweak DataFrame.apply, DataFrame.drop, DataFrame.merge, DataFrame.rank, DataFrame.reindex, DataFrame.replace

Add DataFrame.assign, DataFrame.filter

Tweak Series.rank

Add pandas.isnull

Tweak DataFrame.loc

A few changes to numpy as well

Allow tuples -> numpy.array

Tweak numpy.setdiff1d

Add numpy.cos, numpy.deg2rad, numpy.sin, numpy.cos

Everything was done using the latest Pandas docs for reference to data types:

https://pandas.pydata.org/pandas-docs/stable/reference/

I also did my best to add tests to support the changes as well
opened by TheCleric 7
Shelelena/pandas improvements
Improvements in pandas DataFrame, DataFrameGroupBy and SeriesGroupBy

specify DataFrame.groupby

add DataFrameGroupBy.aggregate

adjust data type in DataFrame.__init__

add __getattr__ to get columns in DataFrame and DataFrameGroupBy

correct return type of DataFrameGroupBy.__getitem__

add some missing statistical methods to DataFrameGroupBy and SeriesGroupBy
opened by Shelelena 7
Fix numpy arange overload

Change order of start/stop to comply with numpy documentation (https://numpy.org/doc/stable/reference/generated/numpy.arange.html) and change data types to float.

This is my first ever PR on github for a public repository, so please be gentle. If I need to clear anything up, please let me know.

opened by Hvlot 5
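
For readers unfamiliar with overloads, the start/stop ordering discussed in the arange fix above can be sketched with typing.overload. This is a pure-Python stand-in that returns a plain list; the name toy_arange and its signatures are assumptions for illustration and are not the definitions used in the stubs.

```python
from typing import List, Optional, overload


@overload
def toy_arange(stop: float) -> List[float]: ...
@overload
def toy_arange(start: float, stop: float, step: float = ...) -> List[float]: ...


def toy_arange(
    start: float, stop: Optional[float] = None, step: float = 1.0
) -> List[float]:
    """Toy version of the calling convention arange([start,] stop[, step])."""
    if stop is None:
        # Single-argument form: the only argument is the stop value.
        start, stop = 0.0, start
    if step == 0:
        raise ValueError("step must not be zero")
    values: List[float] = []
    index = 0
    while True:
        current = start + index * step
        if (step > 0 and current >= stop) or (step < 0 and current <= stop):
            break
        values.append(current)
        index += 1
    return values
```

With one argument the value is treated as stop; with two or three, the first is start. Listing the overloads in this order mirrors how the call forms are documented.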
Missing pandas.to_numeric

The pandas stubs are missing pandas.to_numeric.

I would like to do a PR but I'm not really sure where to start or how to write proper type hints for this, as I've only just started learning about python typing for the last few days. Any help would be much appreciated.

opened by wwuck 5
Add support for Series and DF at methods

Created _AtIndexer classes for Series and DataFrame and used them to type the corresponding at() methods.

Partial solution to #74.

This doesn't fully work, because it doesn't handle the possibility that a data frame will contain categorical (string) or integer data, instead of just float. I don't know how to do this.

opened by rpgoldman 5
Add support for pandas.IntDtypes and pandas.UIntDtypes
This adds support for pandas:

Int8Dtype

Int16Dtype

Int32Dtype

Int64Dtype

UInt8Dtype

UInt16Dtype

UInt32Dtype

UInt64Dtype

As well as a slew of base classes.

We'll see how the CI likes it, but for some reason on my local machine, mypy is saying it can't find any of the types, when they have been clearly added to the __init__.pyi in the pandas-stubs root.

If that continues to be a problem, I may need advice on how to fix.
opened by TheCleric 4
Will numpy stubs be removed after next numpy release?

Numpy has finally merged the stubs from numpy-stubs into the main numpy project.

https://github.com/numpy/numpy-stubs/pull/88 https://github.com/numpy/numpy/pull/16515

Will the numpy stubs in this project be removed when numpy 1.20.0 is released?

opened by wwuck 4
No overload variant of "subplots" matches argument type "bool"
When I perform the following:

from matplotlib.pyplot import subplots
FIG, AXES = subplots(constrained_layout=True)

I get the warning:

No overload variant of "subplots" matches argument type "bool".

Does that need to be added?
opened by uihsnv 0

test_frame_iloc fails on Pandas 1.2

tests/pandas_test.py line 92 fails on Pandas 1.2

Extracting the relevant code

import pandas as pd
df: pd.DataFrame = pd.DataFrame(
    [[1.0, 2.0], [4.0, 5.0], [7.0, 8.0]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)
s: "pd.Series[float]" = df["shield"].copy()
df.iloc[0] = s

Results in

ValueError: could not broadcast input array from shape (3) into shape (2)

This runs fine on Pandas 1.1.5

opened by EdwardJRoss 0

Pandas `DataFrame.concat` missing some arguments

The concat method for joining multiple DataFrames appears to be missing several arguments, such as join, keys, levels, and more.

https://github.com/predictive-analytics-lab/data-science-types/blob/faebf595b16772d3aa70d56ea179a2eaffdbd565/pandas-stubs/__init__.pyi#L37-L42

Compare to the Pandas docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

opened by kevinhu 0

Releases(v0.2.23)

v0.2.23(Feb 16, 2021)
Changes

improved README

more type stubs

v0.2.22(Dec 15, 2020)
Changes

add MANIFEST.in file

v0.2.21(Dec 3, 2020)

Our great contributors have worked tirelessly to bring you this release.
v0.2.20(Nov 5, 2020)

Changes

More stubs.
v0.2.19(Oct 11, 2020)

As always, more type stubs have been added.
v0.2.18(Sep 17, 2020)
Changes

license change

more pandas stubs

v0.2.17(Aug 10, 2020)

Changes

A few small things.
v0.2.16(Jun 29, 2020)
Changes

fix numpy scalar hierarchy

v0.2.15(Jun 24, 2020)
Changes

Improvements in pandas DataFrame, DataFrameGroupBy and SeriesGroupBy

specify DataFrame.groupby

add DataFrameGroupBy.aggregate

adjust data type in DataFrame.__init__

add __getattr__ to get columns in DataFrame and DataFrameGroupBy

correct return type of DataFrameGroupBy.__getitem__

add some missing statistical methods to DataFrameGroupBy and SeriesGroupBy

v0.2.12(May 21, 2020)

v0.2.7(Apr 2, 2020)
Changes

add MultiIndex

v0.2.6(Mar 13, 2020)
Changes

more functions

v0.2.5.post1(Feb 17, 2020)
Changes

this release is for testing the CI

v0.2.5(Feb 17, 2020)
Changes

more functions for pandas

improvement of .loc and .iloc

v0.2.4(Feb 5, 2020)

More functions. Mostly from pandas.
v0.2.3(Jan 22, 2020)

Minor updates to pandas Dataframe, Series and Index
v0.2.2(Jan 21, 2020)
Changes

More functions; nothing major

v0.2.1(Jan 8, 2020)
Changes

add some more convenient overloads

v0.2.0(Jan 7, 2020)
Changes

thanks to improved testing, a ton of errors were fixed for this version

the numeric types in numpy now form a hierarchy; hopefully this doesn't lead to problems

as always, lots of functions were added

v0.1.6(Dec 9, 2019)
Changes

Fixed the behavior of np.max(), np.min(), np.sum() and np.prod()

v0.1.5(Dec 6, 2019)
Changes

changed the name of the package in setup.py to fit the new name

added more tests and organized them better

fix some problems with division and astype which the new tests exposed

v0.1.4(Dec 6, 2019)
Changes

the name has been changed to "DataScienceTypes"

added lots of numpy functions

refined the generic type of ndarray

v0.1.3(Dec 2, 2019)
Changes

undo making the generic type of ndarray covariant; it is invariant again

v0.1.2(Nov 28, 2019)
Changes

make the generic type of ndarray covariant

add various missing functions

v0.1.1(Oct 2, 2019)
Changes

change module structure of pandas stubs

track the type of numpy ndarrays more precisely

add various missing functions

v0.1.0(Sep 23, 2019)

This is our first release. It's certainly not perfect yet, but it is usable.

Owner

Predictive Analytics Lab


PEP-484 typing stubs for SQLAlchemy 1.4 and SQLAlchemy 2.0

SQLAlchemy 2 Stubs These are PEP-484 typing stubs for SQLAlchemy 1.4 and 2.0. They are released concurrently along with a Mypy extension which is desi

139 Dec 30, 2022

Pymxs, the 3DsMax bindings of Maxscript to Python doesn't come with any stubs

PyMXS Stubs generator What Pymxs, the 3DsMax bindings of Maxscript to Python doe

19 Dec 27, 2022

A plugin for flake8 integrating Mypy.

flake8-mypy NOTE: THIS PROJECT IS DEAD It was created in early 2017 when Mypy performance was often insufficient for in-editor linting. The Flake8 plu

103 Jun 23, 2022

A plugin for Flake8 that checks pandas code

pandas-vet pandas-vet is a plugin for flake8 that provides opinionated linting for pandas code. It began as a project during the PyCascades 2019 sprin

146 Dec 28, 2022

Performant type-checking for python.

Pyre is a performant type checker for Python compliant with PEP 484. Pyre can analyze codebases with millions of lines of code incrementally – providi

6.2k Jan 4, 2023

A static type analyzer for Python code

pytype - ✔ Pytype checks and infers types for your Python code - without requiring type annotations. Pytype can: Lint plain Python code, flagging c

4k Dec 31, 2022

Static type checker for Python

Static type checker for Python Speed Pyright is a fast type checker meant for large Python source bases. It can run in a “watch” mode and performs fas

9.2k Jan 3, 2023

Unbearably fast O(1) runtime type-checking in pure Python.

Look for the bare necessities, the simple bare necessities. Forget about your worries and your strife. — The Jungle Book.

1.4k Jan 1, 2023

A plugin for Flake8 finding likely bugs and design problems in your program. Contains warnings that don't belong in pyflakes and pycodestyle.

flake8-bugbear A plugin for Flake8 finding likely bugs and design problems in your program. Contains warnings that don't belong in pyflakes and pycode

869 Dec 30, 2022

Optional static typing for Python 3 and 2 (PEP 484)

Mypy: Optional Static Typing for Python Got a question? Join us on Gitter! We don't have a mailing list; but we are always happy to answer questions o

14.4k Jan 8, 2023

The strictest and most opinionated python linter ever!

wemake-python-styleguide Welcome to the strictest and most opinionated python linter ever. wemake-python-styleguide is actually a flake8 plugin with s

2.1k Jan 1, 2023

coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.

"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a

3.4k Dec 29, 2022

Automated security testing using bandit and flake8.

flake8-bandit Automated security testing built right into your workflow! You already use flake8 to lint all your code for errors, ensure docstrings ar

96 Jan 1, 2023

Easy saving and switching between multiple KDE configurations.

Konfsave Konfsave is a config manager. That is, it allows you to save, back up, and easily switch between different (per-user) system configurations.

42 Sep 25, 2022

A framework for detecting, highlighting and correcting grammatical errors on natural language text.

Gramformer Human and machine generated text often suffer from grammatical and/or typographical errors. It can be spelling, punctuation, grammatical or

1.3k Jan 8, 2023

Utilities for pycharm code formatting (flake8 and black)

Pycharm External Tools Extensions to Pycharm code formatting tools. Currently supported are flake8 and black on a selected code block. Usage Flake8 [P

13 Nov 3, 2022

open source tools to generate mypy stubs from protobufs

mypy-protobuf: Generate mypy stub files from protobuf specs We just released a new major release mypy-protobuf 2. on 02/02/2021! It includes some back

527 Jan 3, 2023

Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world of scientific Python packages (numpy, scipy, matplotlib).

Crab - A Recommendation Engine library for Python Crab is a flexible, fast recommender engine for Python that integrates classic information filtering r

1.2k Dec 21, 2022

Re-apply type annotations from .pyi stubs to your codebase.

retype Re-apply type annotations from .pyi stubs to your codebase. Usage Usage: retype [OPTIONS] [SRC]... Re-apply type annotations from .pyi stubs

131 Nov 17, 2022

A tool for beginners that helps them gather information: Email Header Analysis, Instagram Information, Instagram Username Check, IP Information, Phone Number Information, Port Scan

This tool is for beginners and helps them gather information: Email Header Analysis, Instagram Information, Instagram Username Check, IP Information, Phone Number Information, Port Scan. The tool first shows your hostname and public IP, then takes user input and acts according to the selected option. It works on different operating systems.

5 Feb 18, 2022
