pandas: powerful Python data analysis toolkit

Overview



What is it?

pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way towards this goal.

Main Features

Here are just a few of the things that pandas does well:

  • Easy handling of missing data (represented as NaN, NA, or NaT) in floating point as well as non-floating point data
  • Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects
  • Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations
  • Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data
  • Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects
  • Intelligent label-based slicing, fancy indexing, and subsetting of large data sets
  • Intuitive merging and joining of data sets
  • Flexible reshaping and pivoting of data sets
  • Hierarchical labeling of axes (possible to have multiple labels per tick)
  • Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving/loading data from the ultrafast HDF5 format
  • Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging
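
A few of the features above can be illustrated with a short sketch. The data, labels, and variable names here are made up for illustration; only the pandas calls themselves are real API.

```python
import numpy as np
import pandas as pd

# Missing data: NaN is skipped by default in reductions such as sum().
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
print(s.sum())  # 4.0

# Automatic alignment: values are matched by label, not by position.
t = pd.Series([10.0, 20.0], index=["c", "a"])
aligned_sum = s + t
print(aligned_sum.tolist())  # [21.0, nan, 13.0]

# Split-apply-combine with groupby.
df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1, 2, 5]})
totals = df.groupby("team")["score"].sum()

# Merging two frames on a shared key column.
left = pd.DataFrame({"key": ["k1", "k2"], "a": [1, 2]})
right = pd.DataFrame({"key": ["k1", "k2"], "b": [3, 4]})
merged = left.merge(right, on="key")

# Time series: date range generation and frequency conversion.
idx = pd.date_range("2022-01-01", periods=4, freq="D")
ts = pd.Series([1, 2, 3, 4], index=idx)
two_day = ts.resample("2D").sum()
print(two_day.tolist())  # [3, 7]
```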

Where to get it

The source code is currently hosted on GitHub at: https://github.com/pandas-dev/pandas

Binary installers for the latest released version are available at the Python Package Index (PyPI) and on Conda.

# conda
conda install pandas
# or PyPI
pip install pandas
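
After installing, a quick check along these lines (a minimal sketch, not part of the official instructions) should confirm that pandas imports and reports its version:

```python
import pandas as pd

# Print the installed version string; the value depends on what was installed.
print(pd.__version__)
```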

Dependencies

See the full installation instructions for minimum supported versions of required, recommended and optional dependencies.

Installation from sources

To install pandas from source you need Cython in addition to the normal dependencies above. Cython can be installed from PyPI:

pip install cython

In the pandas directory (same one where you found this file after cloning the git repo), execute:

python setup.py install

Or, to install in development mode:

python -m pip install -e . --no-build-isolation --no-use-pep517

If you have make, you can also use make develop to run the same command.

Or alternatively:

python setup.py develop

See the full instructions for installing from source.

License

BSD 3-Clause

Documentation

The official documentation is hosted on PyData.org: https://pandas.pydata.org/pandas-docs/stable

Background

Work on pandas started at AQR (a quantitative hedge fund) in 2008 and has been under active development since then.

Getting Help

For usage questions, the best place to go is Stack Overflow. General questions and discussions can also take place on the pydata mailing list.

Discussion and Development

Most development discussions take place on GitHub in this repo. Further, the pandas-dev mailing list can also be used for specialized discussions or design issues, and a Gitter channel is available for quick development related questions.

Contributing to pandas

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

A detailed overview of how to contribute can be found in the contributing guide. There is also an overview on GitHub.

If you are simply looking to start working with the pandas codebase, navigate to the GitHub "issues" tab and start looking through interesting issues. There are a number of issues listed under Docs and good first issue where you could start out.

You can also triage issues which may include reproducing bug reports, or asking for vital information such as version numbers or reproduction instructions. If you would like to start triaging issues, one easy way to get started is to subscribe to pandas on CodeTriage.

Or maybe, through using pandas, you have an idea of your own, or you are looking for something in the documentation and thinking "this can be improved". You can do something about it!

Feel free to ask questions on the mailing list or on Gitter.

As contributors and maintainers of this project, you are expected to abide by pandas' code of conduct. More information can be found in the Contributor Code of Conduct.

Comments
  • DOC: fix code in groupby documentation

    • ~~closes #xxxx (Replace xxxx with the GitHub issue number)~~
    • ~~Tests added and passed if fixing a bug or adding a new feature~~
    • [X] All code checks passed.
    • ~~Added type annotations to new arguments/methods/functions.~~
    • ~~Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.~~

    The missing new line in the second example prevents the execution of the last line.

    opened by abonte 0
  • BUG: read_parquet fails for hdfs:// files with latest fsspec

    Pandas version checks

    • [X] I have checked that this issue has not already been reported.

    • [X] I have confirmed this bug exists on the latest version of pandas.

    • [ ] I have confirmed this bug exists on the main branch of pandas.

    Reproducible Example

    import pandas as pd

    # With fsspec==2022.8.2 this call works:
    df = pd.read_parquet("hdfs:///path/to/myfile.parquet")
    # With fsspec==2022.11.0 the same call fails with:
    # OSError: only valid on seekable files
    df = pd.read_parquet("hdfs:///path/to/myfile.parquet")
    

    Issue Description

    In 2022.10.0, fsspec changed its hdfs backend to use the new filesystem in pyarrow. This seems to break compatibility with pandas: the new backend apparently returns a non-seekable file, whereas pandas expects a seekable one.

    One solution could be to have pandas require fsspec<=2022.8.2 which is the last version which worked.

    Another option would be to look upstream to fsspec and have them guarantee a seekable filehandle.

    A third would be to modify the pandas reader to detect a non seekable filehandle and buffer the file.
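
The third option could look roughly like the sketch below. This is not the pandas implementation; `ensure_seekable` and `ForwardOnlyStream` are hypothetical names used only for illustration, and the approach assumes the file fits in memory.

```python
import io


def ensure_seekable(handle):
    """Return `handle` if it supports seek(); otherwise copy its bytes into memory."""
    if handle.seekable():
        return handle
    # Reads the whole remaining stream, so this only suits files that fit in memory.
    return io.BytesIO(handle.read())


class ForwardOnlyStream(io.BytesIO):
    """Hypothetical stand-in for a remote stream that reports it cannot seek."""

    def seekable(self):
        return False


stream = ForwardOnlyStream(b"example bytes")
buffered = ensure_seekable(stream)
print(buffered.seekable())  # True
print(buffered.read())      # b'example bytes'
```

The trade-off is memory use: buffering makes any readable stream seekable, at the cost of holding the full file contents in memory before parsing starts.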

    Expected Behavior

    read_parquet should continue to work with hdfs remote files as it did with earlier versions of the fsspec dependency

    Installed Versions

    INSTALLED VERSIONS

    commit : 8dab54d6573f7186ff0c3b6364d5e4dd635ff3e7
    python : 3.8.13.final.0
    python-bits : 64
    OS : Linux
    OS-release : 5.4.0-77-generic
    Version : #86~18.04.1-Ubuntu SMP Fri Jun 18 01:23:22 UTC 2021
    machine : x86_64
    processor : x86_64
    byteorder : little
    LC_ALL : None
    LANG : None
    LOCALE : None.None

    pandas : 1.5.2
    numpy : 1.24.1
    pytz : 2022.7
    dateutil : 2.8.2
    setuptools : 51.3.3
    pip : 20.3.4
    Cython : None
    pytest : None
    hypothesis : None
    sphinx : None
    blosc : None
    feather : None
    xlsxwriter : None
    lxml.etree : None
    html5lib : None
    pymysql : None
    psycopg2 : None
    jinja2 : None
    IPython : 7.26.0
    pandas_datareader: None
    bs4 : None
    bottleneck : None
    brotli : None
    fastparquet : None
    fsspec : 2022.11.0
    gcsfs : None
    matplotlib : None
    numba : None
    numexpr : None
    odfpy : None
    openpyxl : None
    pandas_gbq : None
    pyarrow : 10.0.1
    pyreadstat : None
    pyxlsb : None
    s3fs : None
    scipy : None
    snappy : None
    sqlalchemy : None
    tables : None
    tabulate : None
    xarray : None
    xlrd : None
    xlwt : None
    zstandard : None
    tzdata : None

    Bug Needs Triage 
    opened by f4hy 1
  • DEPR: Add FutureWarning for pandas.io.sql.execute

    • [x] closes #50185
    • [x] Tests added and passed if fixing a bug or adding a new feature
    • [x] All code checks passed.
    • [ ] Added type annotations to new arguments/methods/functions.
    • [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
    opened by luke396 1
  • TST: Get tests to run and fix them to pass

    NOTE: test_metadata_propagation is not fixed yet in this draft pull request.

    Changed the class name from Generic to TestGeneric in order to get the test to run and then fixed five groups of tests (test_rename, test_get_numeric_data, test_frame_or_series_compound_dtypes, test_metadata_propagation, test_api_compat) in order to make sure that all of the tests pass.

    opened by phershbe 1
  • BUG: groupby with empty object, categorical grouper, and dropna=False fails

    • [x] closes #50634
    • [x] Tests added and passed if fixing a bug or adding a new feature
    • [x] All code checks passed.
    • [x] Added type annotations to new arguments/methods/functions.
    • [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
    Bug Groupby Missing-data Categorical 
    opened by rhshadrach 1
  • BUG: groupby with empty object, categorical grouper, and dropna=False fails

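    from pandas import DataFrame

    # Build a frame with a categorical grouper, then drop all rows.
    df = DataFrame({'a': [1, 1, 2], 'b': [3, 4, 5]})
    df['a'] = df['a'].astype('category')
    df = df.iloc[:0]
    # Grouping the empty frame with dropna=False triggers the error.
    gb = df.groupby('a', dropna=False, observed=True)
    print(gb.sum())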
    

    gives ValueError: attempt to get argmax of an empty sequence

    Bug Groupby Missing-data Categorical 
    opened by rhshadrach 0
Releases(v1.5.2)
  • v1.5.2(Nov 22, 2022)

    This is a patch release in the 1.5.x series and includes some regression and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Thanks to all the contributors who made this release possible.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.5.2.tar.gz(4.96 MB)
  • v1.5.1(Oct 19, 2022)

    This is a patch release in the 1.5.x series and includes some regression and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Thanks to all the contributors who made this release possible.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.5.1.tar.gz(4.95 MB)
  • v1.5.0(Sep 19, 2022)

    This release includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes. pandas 1.5.0 supports Python 3.8 and higher.

    The release will be available on the defaults and conda-forge channels:

    conda install -c conda-forge pandas

    Or via PyPI:

    python3 -m pip install --upgrade pandas

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.5.0.tar.gz(4.95 MB)
  • v1.4.4(Aug 31, 2022)

    This is a patch release in the 1.4.x series and includes some regression and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Thanks to all the contributors who made this release possible.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.4.4.tar.gz(4.72 MB)
  • v1.5.0rc0(Aug 24, 2022)

    We are pleased to announce a release candidate for pandas 1.5.0. If all goes well, we'll release pandas 1.5.0 in about two weeks.

    See the whatsnew for a list of all the changes.

    The release will be available on conda-forge and PyPI.

    The release can be installed from PyPI

    python -m pip install --upgrade --pre pandas==1.5.0rc0
    

    Or from conda-forge

    conda install -c conda-forge/label/pandas_rc pandas==1.5.0rc0
    

    Please report any issues with the release candidate on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.5.0rc0.tar.gz(4.94 MB)
  • v1.4.3(Jun 23, 2022)

  • v1.4.2(Apr 2, 2022)

  • v1.4.1(Feb 12, 2022)

    This is the first patch release in the 1.4.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.4.1.tar.gz(4.71 MB)
  • v1.4.0(Jan 22, 2022)

    This release includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes. pandas 1.4.0 supports Python 3.8 and higher.

    The release will be available on the defaults and conda-forge channels:

    conda install -c conda-forge pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.4.0.tar.gz(4.70 MB)
  • v1.4.0rc0(Jan 6, 2022)

    We are pleased to announce a release candidate for pandas 1.4.0. If all goes well, we'll release pandas 1.4.0 in about two weeks.

    See the whatsnew for a list of all the changes. pandas 1.4.0 supports Python 3.8 and higher.

    The release will be available on conda-forge and PyPI.

    The release can be installed from PyPI

    python -m pip install --upgrade --pre pandas==1.4.0rc0
    

    Or from conda-forge

    conda install -c conda-forge/label/pandas_rc pandas==1.4.0rc0
    

    Please report any issues with the release candidate on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.4.0rc0.tar.gz(4.69 MB)
  • v1.3.5(Dec 12, 2021)

  • v1.3.4(Oct 17, 2021)

    This is a patch release in the 1.3.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.4.tar.gz(4.51 MB)
  • v1.3.3(Sep 12, 2021)

    This is a patch release in the 1.3.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.3.tar.gz(4.51 MB)
  • v1.3.2(Aug 15, 2021)

    This is a patch release in the 1.3.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.2.tar.gz(4.50 MB)
  • v1.3.1(Jul 25, 2021)

    This is the first patch release in the 1.3.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.1.tar.gz(4.50 MB)
  • v1.3.0(Jul 2, 2021)

    This release includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install -c conda-forge pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.0.tar.gz(4.50 MB)
  • v1.2.5(Jun 22, 2021)

  • v1.3.0rc1(Jun 13, 2021)

    We are pleased to announce a release candidate for pandas 1.3.0. If all goes well, we'll release pandas 1.3.0 in about two weeks.

    See the whatsnew for a list of all the changes.

    The release will be available on conda-forge and PyPI.

    The release can be installed from PyPI

    python -m pip install --upgrade --pre pandas==1.3.0rc1
    

    Or from conda-forge

    conda install -c conda-forge/label/pandas_rc pandas==1.3.0rc1
    

    Please report any issues with the release candidate on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.3.0rc1.tar.gz(4.48 MB)
  • v1.2.4(Apr 12, 2021)

  • v1.2.3(Mar 2, 2021)

  • v1.2.2(Feb 9, 2021)

    This is a patch release in the 1.2.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.2.2.tar.gz(5.21 MB)
  • v1.2.1(Jan 20, 2021)

    This is the first patch release in the 1.2.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.2.1.tar.gz(5.20 MB)
  • v1.2.0(Dec 26, 2020)

    This release includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install -c conda-forge pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.2.0.tar.gz(5.14 MB)
  • v1.2.0rc0(Dec 8, 2020)

    This is the first release candidate for pandas 1.2.0. If all goes well, we'll release pandas 1.2.0 in about two weeks.

    See the whatsnew for a list of all the changes.

    The release can be installed from PyPI

    python -m pip install --upgrade --pre pandas==1.2.0rc0
    

    Or from conda-forge

    conda install -c conda-forge/label/pandas_rc pandas==1.2.0rc0
    

    Please report any issues with the release candidate on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.2.0rc0.tar.gz(5.13 MB)
  • v1.1.5(Dec 7, 2020)

    This is a minor bug-fix release in the 1.1.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.5.tar.gz(4.98 MB)
  • v1.1.4(Oct 30, 2020)

    This is a minor bug-fix release in the 1.1.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.4.tar.gz(4.98 MB)
  • v1.1.3(Oct 5, 2020)

    This is a minor bug-fix release in the 1.1.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.3.tar.gz(4.98 MB)
  • v1.1.2(Sep 8, 2020)

    This is a minor bug-fix release in the 1.1.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.2.tar.gz(4.97 MB)
  • v1.1.1(Aug 20, 2020)

    This is a minor bug-fix release in the 1.1.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

    See the full whatsnew for a list of all the changes.

    The release will be available on the defaults and conda-forge channels:

    conda install pandas
    

    Or via PyPI:

    python3 -m pip install --upgrade pandas
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.1.tar.gz(4.97 MB)
  • v1.1.0(Jul 28, 2020)

    This is a minor release which includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version.

    See the whatsnew for a list of all the changes.

    The release can be installed from PyPI

    python -m pip install --upgrade pandas==1.1.0
    

    Or from conda-forge

    conda install -c conda-forge pandas==1.1.0
    

    Please report any issues with the release on the pandas issue tracker.

    Source code(tar.gz)
    Source code(zip)
    pandas-1.1.0.tar.gz(4.96 MB)