ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions

Overview

ELI5

ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions.

explain_prediction for text data

explain_prediction for image data

It provides support for the following machine learning frameworks and packages:

  • scikit-learn. ELI5 can currently explain weights and predictions of scikit-learn linear classifiers and regressors, print decision trees as text or as SVG, show feature importances, and explain predictions of decision trees and tree-based ensembles. ELI5 understands text processing utilities from scikit-learn and can highlight text data accordingly. Pipeline and FeatureUnion are supported. It can also debug scikit-learn pipelines which contain HashingVectorizer by undoing hashing.
  • Keras - explain predictions of image classifiers via Grad-CAM visualizations.
  • xgboost - show feature importances and explain predictions of XGBClassifier, XGBRegressor and xgboost.Booster.
  • LightGBM - show feature importances and explain predictions of LGBMClassifier, LGBMRegressor and lightgbm.Booster.
  • CatBoost - show feature importances of CatBoostClassifier, CatBoostRegressor and catboost.CatBoost.
  • lightning - explain weights and predictions of lightning classifiers and regressors.
  • sklearn-crfsuite. ELI5 can inspect the weights of sklearn_crfsuite.CRF models.
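
The "undoing hashing" idea from the scikit-learn item can be illustrated with a small standard-library sketch. It is not ELI5's implementation: the helper names (`bucket`, `hash_vectorize`, `invert_hashing`) and the use of `zlib.crc32` as the hash are assumptions made for illustration. A hashing vectorizer stores no vocabulary, so column names are recovered by re-hashing terms seen in sample documents.

```python
import zlib
from collections import defaultdict


def bucket(term, n_features):
    """Map a term to a column index, the way a hashing vectorizer would."""
    return zlib.crc32(term.encode("utf-8")) % n_features


def hash_vectorize(tokens, n_features):
    """Turn a token list into a fixed-size count vector (no vocabulary kept)."""
    row = [0] * n_features
    for tok in tokens:
        row[bucket(tok, n_features)] += 1
    return row


def invert_hashing(sample_docs, n_features):
    """Recover a column -> terms mapping by re-hashing terms from sample docs.

    Several terms may collide in one column, so each column maps to a list.
    """
    column_terms = defaultdict(set)
    for doc in sample_docs:
        for tok in doc:
            column_terms[bucket(tok, n_features)].add(tok)
    return {col: sorted(terms) for col, terms in column_terms.items()}


docs = [["cheap", "pills", "now"], ["meeting", "notes", "now"]]
N_FEATURES = 32

names = invert_hashing(docs, N_FEATURES)
row = hash_vectorize(docs[0], N_FEATURES)

# Print each non-zero column together with the terms that hash into it.
for col, count in enumerate(row):
    if count:
        print(col, count, names[col])
```

With readable names attached to the hashed columns, weights learned on those columns can be reported per term (or per group of colliding terms) rather than per anonymous index.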

ELI5 also implements several algorithms for inspecting black-box models (see Inspecting Black-Box Estimators):

  • TextExplainer explains predictions of any text classifier using the LIME algorithm (Ribeiro et al., 2016). There are also utilities for using LIME with non-text data and arbitrary black-box classifiers, but this feature is currently experimental.
  • The permutation importance method can be used to compute feature importances for black-box estimators.
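
The permutation importance idea can be sketched in plain Python. This is a minimal illustration of the general technique, not ELI5's code: the function names, the accuracy metric, and the toy `black_box` model are all assumptions. A feature's importance is estimated as the average drop in score when that feature's column is shuffled while the model is left untouched.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the black-box prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_iter=5, seed=0):
    """Mean score drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    base = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_iter):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the dataset with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(base - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_iter)
    return importances


def black_box(row):
    """Toy model (hypothetical): looks only at feature 0."""
    return int(row[0] > 0.5)


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [black_box(row) for row in X]

imp = permutation_importance(black_box, X, y)
print(imp)
```

Because the toy model ignores feature 1, shuffling it never changes a prediction and its importance is exactly zero, while shuffling feature 0 degrades accuracy noticeably.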

Explanation and formatting are separated: you can get a text-based explanation to display in a console, an HTML version embeddable in an IPython notebook or in web dashboards, a pandas.DataFrame object if you want to process results further, or a JSON version which allows custom rendering and formatting on a client.
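
The separation of explanation and formatting can be pictured with a toy sketch. It does not call ELI5 and does not mirror its internal classes: `ToyExplanation`, `FeatureWeight`, `to_text`, `to_html` and `to_json` are hypothetical names. One explanation object is computed once, and independent formatters render it for different targets.

```python
import html
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FeatureWeight:
    feature: str
    weight: float


@dataclass
class ToyExplanation:
    """Format-agnostic result: holds data only, no rendering logic."""
    estimator: str
    weights: List[FeatureWeight]


def to_text(expl):
    """Console rendering: one signed weight per line."""
    lines = [f"Explained: {expl.estimator}"]
    lines += [f"{fw.weight:+.3f}  {fw.feature}" for fw in expl.weights]
    return "\n".join(lines)


def to_html(expl):
    """HTML rendering: feature names are escaped before embedding."""
    rows = "".join(
        f"<tr><td>{fw.weight:+.3f}</td><td>{html.escape(fw.feature)}</td></tr>"
        for fw in expl.weights
    )
    return f"<table>{rows}</table>"


def to_json(expl):
    """JSON rendering for a client that does its own formatting."""
    return json.dumps(asdict(expl))


expl = ToyExplanation(
    "toy linear model",
    [FeatureWeight("word=<great>", 1.25), FeatureWeight("word=awful", -0.8)],
)
print(to_text(expl))
```

Keeping the explanation as plain data means a new output target only needs a new formatter, and the explanation step itself does not have to change.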

License is MIT.

Check docs for more.

Note

This is the same project as https://github.com/TeamHG-Memex/eli5/, but due to temporary GitHub access issues, the 0.11 release is prepared in https://github.com/eli5-org/eli5 (this repo).


Comments
  • Keras image explainer not working

    OS: macOS Monterey v12.2.1
    Hardware: MacBook Pro (13-inch, M1, 2020)
    Chip: Apple M1
    Python 3.9.10

    AttributeError                            Traceback (most recent call last)
    Input In [22], in
         10 warnings.simplefilter("ignore") # disable Keras warnings for this tutorial
         11 import keras
    ---> 14 import eli5

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/eli5/__init__.py:6, in <module>
          2 from __future__ import absolute_import
          4 __version__ = '0.11.0'
    ----> 6 from .formatters import (
          7     format_as_html,
          8     format_html_styles,
          9     format_as_text,
         10     format_as_dict,
         11 )
         12 from .explain import explain_weights, explain_prediction
         13 from .sklearn import explain_weights_sklearn, explain_prediction_sklearn

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/eli5/formatters/__init__.py:9, in <module>
          2 """
          3 Functions to convert explanations to human-digestible formats.
          4
          5 TODO: IPython integration, customizability.
          6 """
          8 from .text import format_as_text
    ----> 9 from .html import format_as_html, format_html_styles
         10 try:
         11     from .as_dataframe import (
         12         explain_weights_df, explain_weights_dfs,
         13         explain_prediction_df, explain_prediction_dfs,
         14         format_as_dataframe, format_as_dataframes,
         15     )

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/eli5/formatters/html.py:22, in <module>
         18 from .trees import tree2text
         19 from .text_helpers import prepare_weighted_spans, PreparedWeightedSpans
    ---> 22 template_env = Environment(
         23     loader=PackageLoader('eli5', 'templates'),
         24     extensions=['jinja2.ext.with_'])
         25 template_env.globals.update(dict(zip=zip, numpy=np))
         26 template_env.filters.update(dict(
         27     weight_color=lambda w, w_range: format_hsl(weight_color_hsl(w, w_range)),
         28     remaining_weight_color=lambda ws, w_range, pos_neg:
       (...)
         33     format_decision_tree=lambda tree: _format_decision_tree(tree),
         34 ))

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/jinja2/environment.py:363, in Environment.__init__(self, block_start_string, block_end_string, variable_start_string, variable_end_string, comment_start_string, comment_end_string, line_statement_prefix, line_comment_prefix, trim_blocks, lstrip_blocks, newline_sequence, keep_trailing_newline, extensions, optimized, undefined, finalize, autoescape, loader, cache_size, auto_reload, bytecode_cache, enable_async)
        360 self.policies = DEFAULT_POLICIES.copy()
        362 # load extensions
    --> 363 self.extensions = load_extensions(self, extensions)
        365 self.is_async = enable_async
        366 _environment_config_check(self)

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/jinja2/environment.py:117, in load_extensions(environment, extensions)
        115 for extension in extensions:
        116     if isinstance(extension, str):
    --> 117         extension = t.cast(t.Type["Extension"], import_string(extension))
        119     result[extension.identifier] = extension(environment)
        121 return result

    File ~/miniforge3/envs/tf_gpu/lib/python3.9/site-packages/jinja2/utils.py:149, in import_string(import_name, silent)
        147     else:
        148         return __import__(import_name)
    --> 149     return getattr(__import__(module, None, None, [obj]), obj)
        150 except (ImportError, AttributeError):
        151     if not silent:

    AttributeError: module 'jinja2.ext' has no attribute 'with_'

    opened by mv96 2
  • fix: fixes import error when importing jinja2 > 3.0.3,

    Fixes the import error with jinja2 > 3.0.3; requirements and setup are also updated.

    Tested it with jinja2 = 3.0.0

    Also tested it with jinja2 < 3.0.0 (2.11.3), but this resulted in an error not related to the fix. So support for the lowest possible version of jinja2 is not impacted.

    opened by dvorst 0
  • Fix the build

    • drop py36-legacy, as now py27 works well enough with most libraries already dropping support for it
    • restrict TF and Keras versions (we support only TF 1.x)
    • test sklearn-crfsuite only under Python 2.7, as it does not work with newer sklearn
    opened by lopuhin 0
  • Error: estimator is not supported

    Hello community,

    I am using the eli5 library for image explanations and following this tutorial, but at the end I get an "estimator is not supported" error. Please help me solve it.

    any help is highly appreciated, @lopuhin

    Thanks!

    opened by IITGoaPyVidya 0
  • NotFittedError while using Non-CV Permutation Importance with RFECV

    Hey, I'm trying to pair RFECV with PermutationImportance, but without cross-validation. I tried several approaches, but I always get a NotFittedError for the estimator_func while trying to fit the RFECV. Am I missing something?

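    ```python
    from sklearn.feature_selection import RFECV
    from eli5.sklearn import PermutationImportance

    # estimator_funct, extract_relevant_features, choosen_target, cv_func,
    # score and min_features_to_select are not shown in the report.
    estimator_funct = estimator_funct.fit(extract_relevant_features, choosen_target)

    # Non-CV permutation importance wrapped around the already fitted estimator.
    pi = PermutationImportance(
        estimator_funct, scoring='r2', n_iter=10, random_state=1
    ).fit(extract_relevant_features, choosen_target)

    rfecv = RFECV(
        estimator=pi,
        step=1,
        cv=cv_func,
        scoring=score,
        min_features_to_select=min_features_to_select,
    )

    rfecv.fit(extract_relevant_features,
              choosen_target,
              groups=extract_relevant_features.index)
    ```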

    When I pair RFECV with the CV version of PermutationImportance, everything works fine, but I would like to use the non-CV version.

    opened by enesok 0
  • Unable to import eli5

    I just pip installed eli5 0.13.0 on a new AzureML instance and it is throwing an exception during "import eli5"

    The sklearn.feature_selection.base module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.feature_selection. Anything that cannot be imported from sklearn.feature_selection is now part of the private API.
    Using TensorFlow backend.

    AttributeError                            Traceback (most recent call last)
    Input In [59], in <cell line: 1>()
    ----> 1 import eli5

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/eli5/__init__.py:93, in <module>
         89     pass
         92 try:
    ---> 93     from .keras import (
         94         explain_prediction_keras
         95     )
         96 except ImportError:
         97     # keras is not available
         98     pass

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/eli5/keras/__init__.py:3, in <module>
          1 # -*- coding: utf-8 -*-
    ----> 3 from .explain_prediction import explain_prediction_keras
          4 from .gradcam import gradcam, gradcam_backend

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/eli5/keras/explain_prediction.py:8, in <module>
          5 import PIL
          7 import numpy as np
    ----> 8 import keras
          9 import keras.backend as K
         10 from keras.models import Model

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/__init__.py:25, in <module>
         22 from keras import distribute
         24 # See b/110718070#comment18 for more details about this import.
    ---> 25 from keras import models
         27 from keras.engine.input_layer import Input
         28 from keras.engine.sequential import Sequential

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/models.py:19, in <module>
         16 """Code for model cloning, plus model-related API entries."""
         18 import tensorflow.compat.v2 as tf
    ---> 19 from keras import backend
         20 from keras import metrics as metrics_module
         21 from keras import optimizer_v1

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/backend/__init__.py:1, in <module>
    ----> 1 from .load_backend import epsilon
          2 from .load_backend import set_epsilon
          3 from .load_backend import floatx

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/backend/load_backend.py:90, in <module>
         88 elif _BACKEND == 'tensorflow':
         89     sys.stderr.write('Using TensorFlow backend.\n')
    ---> 90     from .tensorflow_backend import *
         91 else:
         92     # Try and load external backend.
         93     try:

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py:25, in <module>
         22 import numpy as np
         23 from distutils.version import StrictVersion
    ---> 25 from ..utils.generic_utils import transpose_shape
         27 py_all = all
         28 py_any = any

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/utils/generic_utils.py:415, in <module>
        410   else:
        411     return obj.__name__
        414 @tf_contextlib.contextmanager
    --> 415 def skip_failed_serialization():
        416   global _SKIP_FAILED_SERIALIZATION
        417   prev = _SKIP_FAILED_SERIALIZATION

    File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras/utils/tf_contextlib.py:33, in contextmanager(target)
         23 """A tf_decorator-aware wrapper for contextlib.contextmanager.
         24
         25 Usage is identical to contextlib.contextmanager.
       (...)
         30   A callable that can be used inside of a with statement.
         31 """
         32 context_manager = _contextlib.contextmanager(target)
    ---> 33 return tf.__internal__.decorator.make_decorator(target, context_manager, 'contextmanager')

    AttributeError: module 'tensorflow.compat.v2' has no attribute '__internal__'

    opened by nkadochn 3
  • eli5.explain_prediction for multiple predictions

    Hi folks,

    I just started looking into this library. One thing that makes me curious: why isn't there an API like eli5.explain_prediction, but for multiple predictions? It would be nice to be able to provide a slice of inference data and have the method report the overall contribution of the features in the sample.

    opened by PGryllos 0
  • Make Lime color less sensitive

    Hi,

    I would like to make Lime colouring a bit less sensitive: highlight only words that have weights above a specific threshold.

    None of the arguments of show_prediction lets me set this threshold.

    @kmike, @lopuhin @zzz4zzz. Is there a way?

    A possible solution could be to add more opacity to the words where weights are not too large.

    opened by jmsquare 0