Bias and Fairness Audit Toolkit

Overview

Aequitas is an open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive tools.

Visit the Aequitas project website

Try out the Aequitas web application

Try out our interactive Colab notebook using the COMPAS dataset.

Documentation

You can find the toolkit documentation here.

For usage examples of the Python library, see our demo notebook from the KDD 2020 hands-on tutorial. Alternatively, have a look at the COMPAS notebook, which uses Aequitas on the ProPublica COMPAS Recidivism Risk Assessment dataset.

Installation

Aequitas is compatible with: Python 3.6+

Install Aequitas using pip:

pip install aequitas

If pip fails, try installing master from source:

git clone https://github.com/dssg/aequitas.git
cd aequitas
python setup.py install

(Note: be mindful of the Python version you use to run setup.py.)

You may then import the aequitas module from Python:

import aequitas

...or execute the auditor from the command line:

aequitas-report
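
For example, a minimal run against a local CSV (the file name below is a placeholder, and the --input flag is the one described in the CLI documentation):

aequitas-report --input your_audit_data.csv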

...or launch the Web front-end from the command line (localhost):

python -m serve

Containerization

To build a Docker container of Aequitas:

docker build -t aequitas .

...or simply via manage:

manage container build

The Docker image's container defaults to launching the development Web server, though this can be overridden via the Docker "command" and/or "entrypoint".

To run such a container, supporting the Web server, on-the-fly:

docker run -p 5000:5000 -e "HOST=0.0.0.0" aequitas

...or, manage a development container via manage:

manage container [create|start|stop]

To contact the team, please email us at [aequitas at uchicago dot edu]

Aequitas Group Metrics

Below are descriptions of the absolute bias metrics calculated by Aequitas.

Metric                    | Formula                        | Description
Predicted Positive        | PP_g                           | The number of entities within a group for which the decision is positive, i.e., the predicted label is 1.
Total Predicted Positive  | K = sum of PP_g over groups    | The total number of entities predicted positive across the groups defined by the attribute.
Predicted Negative        | PN_g                           | The number of entities within a group for which the decision is negative, i.e., the predicted label is 0.
Predicted Prevalence      | PPrev_g = PP_g / |g|           | The fraction of entities within a group which were predicted as positive.
Predicted Positive Rate   | PPR_g = PP_g / K               | The fraction of the entities predicted as positive that belong to a certain group.
False Positive            | FP_g                           | The number of entities of the group with a positive prediction and a negative true label.
False Negative            | FN_g                           | The number of entities of the group with a negative prediction and a positive true label.
True Positive             | TP_g                           | The number of entities of the group with a positive prediction and a positive true label.
True Negative             | TN_g                           | The number of entities of the group with a negative prediction and a negative true label.
False Discovery Rate      | FDR_g = FP_g / PP_g            | The fraction of false positives of a group within the predicted positives of the group.
False Omission Rate       | FOR_g = FN_g / PN_g            | The fraction of false negatives of a group within the predicted negatives of the group.
False Positive Rate       | FPR_g = FP_g / labeled negatives of g | The fraction of false positives of a group within the labeled negatives of the group.
False Negative Rate       | FNR_g = FN_g / labeled positives of g | The fraction of false negatives of a group within the labeled positives of the group.

Each bias disparity for a given group is calculated as the ratio of that group's value of a metric to the reference group's value of the same metric:

    disparity_metric(group) = metric(group) / metric(reference group)
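
As an illustration with made-up numbers: if a group's False Positive Rate is 0.32 and the reference group's is 0.16, that group's FPR Disparity is 0.32 / 0.16 = 2.0, i.e. members of the group are falsely flagged at twice the reference group's rate.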

30 Seconds to Aequitas

Python API

Detailed instructions are here.

To get started, preprocess your input data. Input data has slightly different requirements depending on whether you are using Aequitas via the webapp, CLI or Python package. See general input requirements and specific requirements for the web app, CLI, and Python API in the section immediately below.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string,' so don't forget to recast any 'categorical' type columns!

    from aequitas.preprocessing import preprocess_input_df
    
    # double-check that categorical columns are of type 'string'
    df['categorical_column_name'] = df['categorical_column_name'].astype(str)
    
    df, _ = preprocess_input_df(*input_data*)

The Aequitas Group() class creates a crosstab of your preprocessed data, calculating absolute group metrics from score and label_value truth status (true/false positives and true/false negatives):

    from aequitas.group import Group
    
    g = Group()
    xtab, _ = g.get_crosstabs(df)
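
To inspect the computed metrics, the tutorial notebooks select the metric columns from the crosstab; a minimal sketch following that pattern (list_absolute_metrics() is the helper used there, assuming it is present in your installed version):

    # Names of the absolute metric columns computed by get_crosstabs()
    absolute_metrics = g.list_absolute_metrics(xtab)

    # One row per attribute group, with its metrics rounded for readability
    xtab[['attribute_name', 'attribute_value'] + absolute_metrics].round(2)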

The Plot() class can visualize a single group metric with plot_group_metric(), or a list of bias metrics with plot_group_metric_all(). Suppose you are interested in False Positive Rate across groups. We can visualize this metric in Aequitas:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr')

There are some very small groups in this data set, for example 18 and 32 samples in the Native American and Asian population groups, respectively.

Aequitas includes an option to filter out groups under a minimum group size threshold, as very small group size may be a contributing factor in model error rates:

    from aequitas.plotting import Plot
    
    aqp = Plot()
    fpr_plot = aqp.plot_group_metric(xtab, 'fpr', min_group_size=0.05)

The crosstab dataframe is augmented by every succeeding class with additional layers of information about biases, starting with bias disparities in the Bias() class. There are three get_disparity functions, one for each of the three ways to select a reference group. get_disparity_min_metric() and get_disparity_major_group() methods calculate a reference group automatically based on your data, while the user specifies reference groups for get_disparity_predefined_groups().

    from aequitas.bias import Bias
    
    b = Bias()
    bdf = b.get_disparity_predefined_groups(xtab, 
                        original_df=df, 
                        ref_groups_dict={'race':'Caucasian', 'sex':'Male', 'age_cat':'25 - 45'}, 
                        alpha=0.05, 
                        check_significance=False)
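
The other two reference-group strategies mentioned above operate on the same crosstab; a minimal sketch, keeping only the required arguments:

    # Reference group per metric = the group with the minimum value of that metric
    bdf_min = b.get_disparity_min_metric(xtab, original_df=df)

    # Reference group = the largest (majority) group of each attribute
    bdf_major = b.get_disparity_major_group(xtab, original_df=df)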

Learn more about reference group selection.

The Plot() class visualizes disparities as treemaps, colored according to the disparity between a given group and the reference group; use plot_disparity() for a single disparity metric or plot_disparity_all() for multiple. Saturation is determined by a given fairness threshold.

Let's look at False Positive Rate Disparity.

    fpr_disparity = aqp.plot_disparity(bdf, group_metric='fpr_disparity', 
                                       attribute_name='race')

Now you're ready to obtain metric parities with the Fairness() class:

    from aequitas.fairness import Fairness
    
    f = Fairness()
    fdf = f.get_group_value_fairness(bdf)
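
The tutorial notebooks also roll these determinations up to the attribute and overall level; a short sketch, assuming those helper methods are available in your version:

    # Parity determinations aggregated per attribute (e.g. race, sex, age_cat)
    gaf = f.get_group_attribute_fairness(fdf)

    # A single overall fairness determination for the model
    gof = f.get_overall_fairness(fdf)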

You now have parity determinations for your models that can be leveraged in model selection! If a specific bias metric for a group falls within a given percentage (based on the fairness threshold) of the reference group, the fairness determination is 'True.'
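
For example, with a fairness threshold of 0.8 (the classic "80% rule"), a group's disparity passes when it falls between 0.8 and 1 / 0.8 = 1.25 times the reference group's metric; an FPR Disparity of 1.4 would therefore yield a 'False' (unfair) determination for that group.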

To determine whether group False Positive Rates fall within the "fair" range, use Plot() class fairness methods:

    fpr_fairness = aqp.plot_fairness_group(fdf, group_metric='fpr', title=True)

To quickly review False Positive Rate Disparity fairness determinations, we can use the Plot() class plot_fairness_disparity() method:

    fpr_disparity_fairness = aqp.plot_fairness_disparity(fdf, group_metric='fpr', attribute_name='race')

Input Data

In general, input data is a single table with the following columns:

  • score
  • label_value (for error-based metrics only)
  • at least one attribute e.g. race, sex and age_cat (attribute categories defined by user)
score | label_value | race             | sex    | age | income
0     | 1           | African-American | Female | 27  | 18000
1     | 1           | Caucasian        | Male   | 32  |
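
A minimal sketch of assembling such a table in pandas (the values below are illustrative only):

    import pandas as pd

    # One row per entity: a binary score, the true label, and attribute columns
    df = pd.DataFrame({
        'score':       [0, 1, 1, 0],
        'label_value': [1, 1, 0, 0],
        'race':        ['African-American', 'Caucasian', 'Hispanic', 'Caucasian'],
        'sex':         ['Female', 'Male', 'Male', 'Female'],
        'age_cat':     ['Less than 25', '25 - 45', '25 - 45', 'Greater than 45'],
    })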

Back to 30 Seconds to Aequitas

Input data for Webapp

The webapp requires a single CSV with columns for a binary score, a binary label_value and an arbitrary number of attribute columns. Each row is associated with a single observation.

score

Aequitas webapp assumes the score column is a binary decision (0 or 1).

label_value

This is the ground truth value of a binary decision. The data again must be binary 0 or 1.

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles and then create crosstabs with the newly defined categories.
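
The quartile binning applied to continuous attributes is roughly equivalent to the pandas sketch below (an approximation of the preprocessing behavior, not the exact implementation):

    import pandas as pd

    # Bin a continuous attribute (e.g. income) into quartiles, then cast the
    # resulting categories to strings, as Aequitas expects for attribute columns
    df['income'] = pd.qcut(df['income'], q=4, duplicates='drop').astype(str)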

Back to 30 Seconds to Aequitas

Input data for CLI

The CLI accepts CSV files and accommodates database calls defined in Configuration files.

score

By default, Aequitas CLI assumes the score column is a binary decision (0 or 1). Alternatively, the score column can contain the score (e.g. the output from a logistic regression applied to the data). In this case, the user sets a threshold to determine the binary decision. See configurations for more on thresholds.

label_value

As with the webapp, this is the ground truth value of a binary decision. The data must be binary 0 or 1.

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group value. If continuous, Aequitas will first bin the data into quartiles.

model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Back to 30 Seconds to Aequitas

Input data for Python API

Python input data can be handled identically to CLI by using preprocess_input_df(). Otherwise, you must discretize continuous attribute columns prior to passing the data to Group().get_crosstabs().

    from aequitas.preprocessing import preprocess_input_df
    # *input_data* matches CLI input data norms.
    df, _ = preprocess_input_df(*input_data*)

score

By default, Aequitas assumes the score column is a binary decision (0 or 1). If the score column contains a non-binary score (e.g. the output from a logistic regression applied to the data), the user sets a threshold to determine the binary decision. Thresholds are set in a dictionary passed to get_crosstabs() of format {'rank_abs':[300] , 'rank_pct':[1.0, 5.0, 10.0]}. See configurations for more on thresholds.
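
A minimal sketch of passing that dictionary (assuming, as in the tutorial notebooks, that the keyword argument is score_thresholds):

    from aequitas.group import Group

    g = Group()
    # Binarize a continuous score at several cutoffs: the top 300 entities by
    # rank, and the top 1%, 5%, and 10%
    xtab, _ = g.get_crosstabs(df, score_thresholds={'rank_abs': [300],
                                                    'rank_pct': [1.0, 5.0, 10.0]})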

label_value

This is the ground truth value of a binary decision. The data must be binary (0 or 1).

attributes (e.g. race, sex, age, income)

Group columns can be categorical or continuous. If categorical, Aequitas will produce crosstabs with bias metrics for each group_level. If continuous, Aequitas will first bin the data into quartiles.

If you plan to bin or discretize continuous features manually, note that get_crosstabs() expects attribute columns to be of type 'string'. This excludes the pandas 'categorical' data type, which is the default output of certain pandas discretizing functions. You can recast 'categorical' columns to strings:

   df['categorical_column_name'] = df['categorical_column_name'].astype(str)
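
For example, if age buckets were created with pandas.cut, the result is 'categorical' and needs the recast (bin edges and labels below are illustrative):

    import pandas as pd

    # pd.cut returns a 'categorical' column...
    df['age_cat'] = pd.cut(df['age'], bins=[0, 25, 45, 120],
                           labels=['Less than 25', '25 - 45', 'Greater than 45'])

    # ...so cast it to string before calling get_crosstabs()
    df['age_cat'] = df['age_cat'].astype(str)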

model_id

model_id is an identifier tied to the output of a specific model. With a model_id column you can test the bias of multiple models at once. This feature is available using the CLI or the Python package.

Reserved column names:
  • id
  • model_id
  • entity_id
  • rank_abs
  • rank_pct

Back to 30 Seconds to Aequitas

Development

Provision your development environment via the shell script develop:

./develop

Common development tasks, such as deploying the webapp, may then be handled via manage:

manage --help

Citing Aequitas

If you use Aequitas in a scientific publication, we would appreciate citations to the following paper:

Pedro Saleiro, Benedict Kuester, Abby Stevens, Ari Anisfeld, Loren Hinkson, Jesse London, Rayid Ghani, Aequitas: A Bias and Fairness Audit Toolkit, arXiv preprint arXiv:1811.05577 (2018). (PDF)

    @article{2018aequitas,
      title={Aequitas: A Bias and Fairness Audit Toolkit},
      author={Saleiro, Pedro and Kuester, Benedict and Stevens, Abby and Anisfeld, Ari and Hinkson, Loren and London, Jesse and Ghani, Rayid},
      journal={arXiv preprint arXiv:1811.05577},
      year={2018}
    }
Comments
  • Adding a plotting function

    I fixed some issues that caused tests to fail (these are the things that have nothing to do with the plotting). I added a plotting function that plots the disparities themselves. This is the figure we had at the poster and NOT the comparison between models. I will also add the comparison between models.

    Check it out and be brutal with your comments - we want this to be perfect!

    opened by kalkairis 3
  • Signif subset

    Adjusted plotting.py and bias.py to make check_significance a boolean flag to run all significance calculations, and selected_significance to allow the user to limit the check_significance flag to a subset of columns. Default is no significance calculations.

    Made a few tweaks to docstrings and list_disparities/ list_significance methods to be cleaner/ clearer.

    bug 
    opened by lorenh516 2
  • Documentation clarity

    I'm working through using Aequitas for a bias audit and am noticing a few issues in the documentation:

    • The Python API preprocessing import statement is incorrect; it should be from aequitas.preprocessing import preprocess_input_df, i.e. preprocess_input_df should not be followed by parentheses in the import.

    • The documentation says that attribute columns can be categorical or continuous, which is confusing in terms of when cleaning functions are called in the pre-processing step. For instance, I had made age categories using the cut function, which returned a categorical datatype; this caused the discretize cleaning function to be called, which then threw an error. It should be clear that categorical columns must be converted to strings.

    • Import statements are missing for the Plot and Bias modules; the docs should include from aequitas.bias import Bias and from aequitas.plotting import Plot in those example calls.

    • The call for the visualization of the disparity treemaps doesn't work as is, it should be j = p.plot_disparity_all(.... rather than j = aqp.plot_disparity_all(... to match the earlier use of Plot().

    opened by cherdeman 2
  • Add dockerfile and enable docker running in serve.py

    Hi, Is this of any interest or value? I have added a dockerfile, docker-compose.yml and updated serve.py so you can run aequitas in a docker container for local testing.

    You can run with docker-compose up

    and then access aequitas at localhost:5000

    I needed to slightly alter serve.py so it could run the flask app with host '0.0.0.0' when running in docker.

    Happy to make any tweaks / changes if needs be.

    opened by deparkes 2
  • Bug on min_metric disparities when there are no positive predictions

    Bug

    When there are multiple protected attributes, and you assess disparities with get_disparity_min_metric, aequitas enters the if on line 102 of bias.py and returns the wrong disparities matrix (as seen in the bottom figure).

    Steps to reproduce bug:

    import pandas as pd
    import aequitas
    import numpy as np
    
    # Fake DF to demonstrate bug
    n_samples = 1000
    
    aequitas_df = pd.DataFrame({
        'label_value': (np.random.random((n_samples,)) > 0.95).astype(int),
        'score': np.zeros((n_samples,)).astype(int),  # All negative predictions  # -> leads to bug
    #    'score': (np.random.random((n_samples,)) > 0.90).astype(int),  # -> no bug with these scores
        'gender': (np.random.random((n_samples,)) > 0.6).astype(str),
        'age_group': (4 * np.random.random((n_samples,))).astype(int).astype(str),
    })
    
    
    from aequitas.group import Group
    from aequitas.bias import Bias
    
    attr_cols = list(set(aequitas_df.columns) - {
        'entity_id', 'score', 'label_value', 'as_of_date'
    })
    
    # Initialize aequitas objects
    g = Group()
    b = Bias()
    
    # Get confusion matrix and metrics for each individual group and attribute
    confusion_matrix_metrics, _ = g.get_crosstabs(
        aequitas_df, attr_cols=attr_cols,
    )
    
    disparities_matrix = b.get_disparity_min_metric(
        confusion_matrix_metrics,
        original_df=aequitas_df,
        fill_divbyzero=1e3,
    )
    

    [Screenshots of the confusion_matrix_metrics and disparities_matrix dataframes were attached to the original issue.]

    opened by AndreFCruz 1
  • Add Statistical Significance

    Metric Statistical Significance [Resolves issue #51 ]

    src/aequitas/plotting.py: Adjust methods to visualize statistical significance of disparities

    • Added options to limit metrics in the _assemble_ref_groups() method
    • Added options to indicate statistical significance to tree maps in disparity visualization methods

    src/aequitas/bias.py: Implement methods to calculate statistical significance of disparities based on false positives, false negatives, binary label values, and scores of a given population in relation to a reference group

    • Added get_measure_sample(), check_equal_variance(), calculate_significance() , get_statistical_significance() methods to pull samples from original df and run statistical significance (t-test) between population groups and predefined or automatically determined reference groups for disparity metrics calculated in Bias() class
    enhancement 
    opened by lorenh516 1
  • made direnv part of the development environment

    Now:

    1. It's clearer how you can specify your AWS credentials (if necessary)
    2. With PATH MANIPULATION (woo!) we can keep dev/management Python library requirements entirely separate from project lib (aequitas) requirements, in a nested virtualenv, without further nasty workarounds ;)

    To explain (2) a bit more: now, not only awsebcli, but both/all the new libraries I've added get installed into their own virtualenv, so there's no chance of conflict with what aequitas needs. (And, this removes some hacks I had to use in the previous solution.)

    opened by jesteria 1
  • reorganized requirements management

    … to allow for conflicting requirements between project and awsebcli (tabulate).

    I don't love this solution, in so far as argcmdr must still be installed into the main virtualenv; but, argcmdr is intended to be light and flexible with requirements, whereas the awsebcli is certainly not. Rather, in the development environment, the management library argcmdr is installed along with aequitas and its requirements, but proxied management libraries like awsebcli are installed into a nested virtualenv. :smile_cat:

    opened by jesteria 1
  • deployment of webapp to elastic beanstalk

    Last round of tweaks for management of and deployment to an Elastic Beanstalk environment.

    Currently live at: http://aequitas-pro.3gnc2m2geg.us-west-2.elasticbeanstalk.com/

    (We can set a nicer CNAME as desired.)


    Now, having installed the development environment on your local machine via ./install, you can manage the project via manage.

    For example, (having torn down the environment), you can recreate it:

    manage web env create
    

    This will provision AWS with the appropriate security groups, yadda yadda yadda, and the standalone web server, under the default name, aequitas-pro. You can then deploy your branch's last commit under a new version name:

    manage web env deploy 0.3.0
    

    (Or, if this version has already been deployed, this command will re-deploy that version.)

    To get a nice heads-up view of the web environment, launch the AWS console:

    manage web env console
    
    opened by jesteria 1
  • pre-deploy clean-up

    Added some bootstrapping and installation support, and moved app and bin under src as their own Python packages.

    We might want to add to README that the package may be installed with:

    python setup.py install
    

    and/or named as a Python install requirement (e.g. in a requirements file).

    And that you can then execute the auditor with:

    aequitas-audit
    

    And we might add a separate section, expressly for developers, that you can set up your local environment by executing:

    ./install
    
    opened by jesteria 1
  • Refactor get crosstabs

    Change class Group to have gen_metrics_df, where the confusion matrix is obtained through grouping the variables at column level. Small QoL improvements on methods and change in docstrings to NumPy format.

    opened by sgpjesus 0
  • Adds list cast to inferred attribute columns

    Fixes #120

    Adds a cast to the inferred attribute columns, which are of type pandas.core.indexes.base.Index. These are added in the line below to a list comprised of [score_col, label_col]. The Index class does not natively support addition with lists, but it can be cast to a list (which can be added).

    opened by sgpjesus 0
  • [Bug] Pandas columns not converted to list in `get_crosstabs` method

    This makes the method crash in l.319 of the aequitas/group.py file, when trying to add a list and a pandas index object. To fix this, we must cast the object to a python list.

    opened by sgpjesus 0
  • running aequitas without label_value

    In case the ground truth for the model is not available, it should still be possible to run aequitas to get the metrics that do not depend on this input. Right now the Group() class requires label_value as input, but it's not needed for metrics such as predicted positive ratio. One can fill label_value with e.g. 0 or 1, but then some supervised metrics will still be calculated from this placeholder input, and incorrectly so, which can be confusing.

    opened by OliwiaDetmers 0
  • issue regarding uploading the data set and run it on compas notebook

    I want to run a data set on the Web API of Aequitas but am not able to: after uploading, it shows a local error. Please provide a solution with proper steps on how to upload different data sets.

    opened by yajush1998 0
  • how to use web api on different data set

    I want to run a data set on the Web API of Aequitas but am not able to: after uploading, it shows a local error. Please provide a solution with proper steps on how to upload different data sets.

    opened by yajush1998 0