Bottleneck
Bottleneck is a collection of fast, NaN-aware NumPy array functions written in C.
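As one example, to check whether a np.array has any NaNs using NumPy, one must call np.any(np.isnan(array)), which evaluates np.isnan over the entire array before reducing the result. The :meth:`bottleneck.anynan` function interleaves the :meth:`np.isnan` check with the early exit of :meth:`np.any`, so it can stop at the first NaN it finds. This enables up to an O(N) speedup relative to NumPy.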
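The early-exit idea can be illustrated in pure Python. This is only a sketch of the technique, not Bottleneck's C implementation; the function names ``anynan_two_pass`` and ``anynan_early_exit`` are hypothetical and exist only for this illustration.

```python
import math


def anynan_two_pass(values):
    """Mimic np.any(np.isnan(array)): build the full mask, then reduce it."""
    mask = [math.isnan(v) for v in values]  # visits every element
    return any(mask)


def anynan_early_exit(values):
    """Interleave the NaN check with the reduction; stop at the first NaN."""
    for v in values:
        if math.isnan(v):
            return True  # remaining elements are never inspected
    return False


# Best case for early exit: the NaN is the very first element.
data = [math.nan] + [1.0] * 1_000_000
print(anynan_two_pass(data), anynan_early_exit(data))
```

When the first element is NaN, the early-exit version inspects one element while the two-pass version inspects all N, which is where the "up to O(N)" figure comes from. When the input contains no NaNs, both versions must scan everything.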
Bottleneck strives to be a drop-in accelerator for NumPy functions. When using the following libraries, Bottleneck support is automatically enabled and utilized:
Details on the performance benefits can be found in :ref:`benchmarking`.
Example
Let's give it a try. Create a NumPy array:
    >>> import numpy as np
    >>> a = np.array([1, 2, np.nan, 4, 5])
Find the nanmean:
    >>> import bottleneck as bn
    >>> bn.nanmean(a)
    3.0
Moving window mean:
    >>> bn.move_mean(a, window=2, min_count=1)
    array([ 1. ,  1.5,  2. ,  4. ,  4.5])
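To make the semantics of the two calls above concrete, here is a slow pure-Python sketch that reproduces the same numbers. It assumes that ``min_count`` is the minimum number of non-NaN values a window must contain and that it defaults to ``window``. The names ``nanmean_sketch`` and ``move_mean_sketch`` are hypothetical and are not part of Bottleneck.

```python
import math


def nanmean_sketch(values):
    """Mean of the non-NaN values; NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


def move_mean_sketch(values, window, min_count=None):
    """Trailing moving-window mean that skips NaNs.

    Assumption: a window yields NaN when it holds fewer than
    `min_count` non-NaN values (default: `window`).
    """
    if min_count is None:
        min_count = window
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # window ends at index i
        valid = [v for v in values[start:i + 1] if not math.isnan(v)]
        if valid and len(valid) >= min_count:
            out.append(sum(valid) / len(valid))
        else:
            out.append(math.nan)
    return out


a = [1.0, 2.0, math.nan, 4.0, 5.0]
print(nanmean_sketch(a))                            # 3.0
print(move_mean_sketch(a, window=2, min_count=1))   # [1.0, 1.5, 2.0, 4.0, 4.5]
```

The sketch recomputes every window from scratch, so it costs O(N * window) and is meant only to show what is being computed.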
Benchmark
Bottleneck comes with a benchmark suite:
    >>> bn.bench()
    Bottleneck performance benchmark
        Bottleneck 1.3.0.dev0+122.gb1615d7; Numpy 1.16.4
        Speed is NumPy time divided by Bottleneck time
        NaN means approx one-fifth NaNs; float64 used

                       no NaN      no NaN         NaN      no NaN         NaN
                       (100,) (1000,1000) (1000,1000) (1000,1000) (1000,1000)
                       axis=0      axis=0      axis=0      axis=1      axis=1
    nansum               29.7         1.4         1.6         2.0         2.1
    nanmean              99.0         2.0         1.8         3.2         2.5
    nanstd              145.6         1.8         1.8         2.7         2.5
    nanvar              138.4         1.8         1.8         2.8         2.5
    nanmin               27.6         0.5         1.7         0.7         2.4
    nanmax               26.6         0.6         1.6         0.7         2.5
    median              120.6         1.3         4.9         1.1         5.7
    nanmedian           117.8         5.0         5.7         4.8         5.5
    ss                   13.2         1.2         1.3         1.5         1.5
    nanargmin            66.8         5.5         4.8         3.5         7.1
    nanargmax            57.6         2.9         5.1         2.5         5.3
    anynan               10.2         0.3        52.3         0.8        41.6
    allnan               15.1       196.0       156.3       135.8       111.2
    rankdata             45.9         1.2         1.2         2.1         2.1
    nanrankdata          50.5         1.4         1.3         2.4         2.3
    partition             3.3         1.1         1.6         1.0         1.5
    argpartition          3.4         1.2         1.5         1.1         1.6
    replace               9.0         1.5         1.5         1.5         1.5
    push               1565.6         5.9         7.0        13.0        10.9
    move_sum           2159.3        31.1        83.6       186.9       182.5
    move_mean          6264.3        66.2       111.9       361.1       246.5
    move_std           8653.6        86.5       163.7       232.0       317.7
    move_var           8856.0        96.3       171.6       267.9       332.9
    move_min           1186.6        13.4        30.9        23.5        45.0
    move_max           1188.0        14.6        29.9        23.5        46.0
    move_argmin        2568.3        33.3        61.0        49.2        86.8
    move_argmax        2475.8        30.9        58.6        45.0        82.8
    move_median        2236.9       153.9       151.4       171.3       166.9
    move_rank           847.1         1.2         1.4         2.3         2.6
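The "Speed" figures are ratios: the reference time divided by the accelerated time, so values above 1 mean the accelerated version is faster. The sketch below shows one way such a ratio could be computed with the standard ``timeit`` module. It is not how ``bn.bench`` is implemented; ``speed_ratio``, ``two_pass`` and ``early_exit`` are hypothetical names, and the two pure-Python functions merely stand in for a NumPy call and its Bottleneck counterpart.

```python
import math
import timeit


def two_pass(values):
    """Stand-in for the reference: check every element, then reduce."""
    return any([math.isnan(v) for v in values])


def early_exit(values):
    """Stand-in for the accelerated version: stop at the first NaN."""
    for v in values:
        if math.isnan(v):
            return True
    return False


def speed_ratio(reference, candidate, arg, repeat=5, number=10):
    """Return reference time divided by candidate time (best of `repeat`)."""
    t_ref = min(timeit.repeat(lambda: reference(arg), repeat=repeat, number=number))
    t_new = min(timeit.repeat(lambda: candidate(arg), repeat=repeat, number=number))
    return t_ref / t_new


data = [math.nan] + [1.0] * 100_000
ratio = speed_ratio(two_pass, early_exit, data)
print(f"speed: {ratio:.1f}")  # exact value depends on the machine
```

Taking the minimum over several repeats reduces the influence of background noise on the measurement.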
You can also run a detailed benchmark for a single function using, for example, the command:
>>> bn.bench_detailed("move_median", fraction_nan=0.3)
Only arrays with data type (dtype) int32, int64, float32, and float64 are accelerated. All other dtypes result in calls to slower, unaccelerated functions. In the rare case of a byte-swapped input array (e.g. a big-endian array on a little-endian operating system) the function will not be accelerated regardless of dtype.
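The rule above can be checked from Python before calling into Bottleneck. The sketch below uses only NumPy; ``is_fast_path_candidate`` is a hypothetical helper that mirrors the stated rule (native byte order and one of the four listed dtypes) and is not a Bottleneck API. Converting a byte-swapped array to native byte order first is shown as one possible workaround, at the cost of a copy.

```python
import numpy as np

# Dtypes listed above as accelerated.
ACCELERATED = {"int32", "int64", "float32", "float64"}


def is_fast_path_candidate(arr):
    """True if `arr` has native byte order and an accelerated dtype."""
    dt = arr.dtype
    return bool(dt.isnative) and dt.name in ACCELERATED


native = np.arange(5, dtype="float64")

# Build a byte-swapped copy ("S" swaps whatever the native order is).
swapped = native.astype(native.dtype.newbyteorder("S"))

# Convert back to native byte order ("=" means native); values are preserved.
restored = swapped.astype(swapped.dtype.newbyteorder("="))

print(is_fast_path_candidate(native))    # True
print(is_fast_path_candidate(swapped))   # False
print(is_fast_path_candidate(restored))  # True
```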
Where
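===================   ========================================================
 download             https://pypi.python.org/pypi/Bottleneck
 docs                 https://bottleneck.readthedocs.io
 code                 https://github.com/pydata/bottleneck
 mailing list         https://groups.google.com/group/bottle-neck
===================   ========================================================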
License
Bottleneck is distributed under a Simplified BSD license. See the LICENSE file and LICENSES directory for details.
Install
Requirements:
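======================== ====================================================
Bottleneck               Python 3.6, 3.7, 3.8; NumPy 1.15.0+ (follows NEP 29)
Compile                  gcc, clang, MinGW or MSVC
Unit tests               pytest, hypothesis
Documentation            sphinx, numpydoc
======================== ====================================================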
Detailed installation instructions can be found at :ref:`installing`.
To install Bottleneck on Linux, Mac OS X, et al.:
$ pip install .
To install Bottleneck on Windows, first install MinGW and add it to your system path. Then install Bottleneck with the command:
python setup.py install --compiler=mingw32
Alternatively, you can use the Windows binaries created by Christoph Gohlke: http://www.lfd.uci.edu/~gohlke/pythonlibs/#bottleneck
Unit tests
To keep the install dependencies light, test dependencies are made available via a setuptools "extra":
$ pip install bottleneck[test]
Or, if working locally:
$ pip install .[test]
After you have installed Bottleneck, run the suite of unit tests:
    In [1]: import bottleneck as bn

    In [2]: bn.test()
    ============================= test session starts =============================
    platform linux -- Python 3.7.4, pytest-4.3.1, py-1.8.0, pluggy-0.12.0
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/chris/code/bottleneck/.hypothesis/examples')
    rootdir: /home/chris/code/bottleneck, inifile: setup.cfg
    plugins: openfiles-0.3.2, remotedata-0.3.2, doctestplus-0.3.0, mock-1.10.4, forked-1.0.2, cov-2.7.1, hypothesis-4.32.2, xdist-1.26.1, arraydiff-0.3
    collected 190 items

    bottleneck/tests/input_modification_test.py ........................... [ 14%]
    ..                                                                       [ 15%]
    bottleneck/tests/list_input_test.py .............................        [ 30%]
    bottleneck/tests/move_test.py .................................          [ 47%]
    bottleneck/tests/nonreduce_axis_test.py ....................             [ 58%]
    bottleneck/tests/nonreduce_test.py ..........                            [ 63%]
    bottleneck/tests/reduce_test.py ....................................... [ 84%]
    ............                                                             [ 90%]
    bottleneck/tests/scalar_input_test.py ..................                 [100%]

    ========================= 190 passed in 46.42 seconds =========================
    Out[2]: True
If you are developing in the git repo, simply run py.test from the repository root.