Machine learning evaluation metrics, implemented in Python, R, Haskell, and MATLAB / Octave

Overview

Note: the current releases of this toolbox are beta releases, meant to test working with Haskell's, Python's, and R's code repositories.

Metrics provides implementations of various supervised machine learning evaluation metrics in the following languages:

  • Python: easy_install ml_metrics
  • R: install.packages("Metrics") from the R prompt
  • Haskell: cabal install Metrics
  • MATLAB / Octave: clone the repo and run setup from the MATLAB command line

For more detailed installation instructions, see the README for each implementation.
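
For example, after installing the Python package, the metrics are exposed as plain functions over actual/predicted values. A minimal usage sketch, assuming the top-level function names rmse, mae, and mapk exposed by ml_metrics (see the Python README for the full API):

    import ml_metrics as metrics

    actual = [3.0, -0.5, 2.0, 7.0]
    predicted = [2.5, 0.0, 2.0, 8.0]

    # Regression-style metrics take parallel sequences of actual and predicted values.
    print(metrics.rmse(actual, predicted))  # root mean squared error
    print(metrics.mae(actual, predicted))   # mean absolute error

    # Ranking metrics take lists of relevant items and ranked predictions.
    print(metrics.mapk([[1, 2, 3]], [[1, 4, 2]], k=3))  # mean average precision at 3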

EVALUATION METRICS

The metrics below are implemented across the four languages (Python, R, Haskell, MATLAB / Octave); see each implementation's README for which metrics are available in a given language.

  • Absolute Error (AE)
  • Average Precision at K (APK, AP@K)
  • Area Under the ROC (AUC)
  • Classification Error (CE)
  • F1 Score (F1)
  • Gini
  • Levenshtein
  • Log Loss (LL)
  • Mean Log Loss (LogLoss)
  • Mean Absolute Error (MAE)
  • Mean Average Precision at K (MAPK, MAP@K)
  • Mean Quadratic Weighted Kappa
  • Mean Squared Error (MSE)
  • Mean Squared Log Error (MSLE)
  • Normalized Gini
  • Quadratic Weighted Kappa
  • Relative Absolute Error (RAE)
  • Root Mean Squared Error (RMSE)
  • Relative Squared Error (RSE)
  • Root Relative Squared Error (RRSE)
  • Root Mean Squared Log Error (RMSLE)
  • Squared Error (SE)
  • Squared Log Error (SLE)
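
Many of these are simple aggregations of the elementwise metrics: MAE, MSE, and MSLE are the means of AE, SE, and SLE over all samples, and RMSE and RMSLE are the square roots of those means. A rough numpy sketch of the standard definitions, for illustration only (the packages above ship their own implementations):

    import numpy as np

    actual = np.array([3.0, 0.5, 2.0, 7.0])
    predicted = np.array([2.5, 0.0, 2.0, 8.0])

    ae = np.abs(actual - predicted)                        # Absolute Error (elementwise)
    se = (actual - predicted) ** 2                         # Squared Error (elementwise)
    sle = (np.log1p(actual) - np.log1p(predicted)) ** 2    # Squared Log Error (elementwise)

    mae = ae.mean()        # Mean Absolute Error
    mse = se.mean()        # Mean Squared Error
    msle = sle.mean()      # Mean Squared Log Error
    rmse = np.sqrt(mse)    # Root Mean Squared Error
    rmsle = np.sqrt(msle)  # Root Mean Squared Log Error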

TO IMPLEMENT

  • F1 score
  • Multiclass log loss
  • Lift
  • Average Precision for binary classification
  • precision / recall break-even point
  • cross-entropy
  • True Pos / False Pos / True Neg / False Neg rates
  • precision / recall / sensitivity / specificity
  • mutual information
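
Several of the planned classification metrics above (true/false positive/negative rates, precision, recall, sensitivity, specificity, F1) reduce to simple arithmetic on confusion-matrix counts. A sketch of the standard definitions, not code from this repo:

    # Confusion-matrix counts for a binary problem (hypothetical values).
    tp, fp, tn, fn = 40, 10, 45, 5

    precision = tp / (tp + fp)            # fraction of predicted positives that are correct
    recall = tp / (tp + fn)               # a.k.a. sensitivity, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    fpr = fp / (fp + tn)                  # false positive rate
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall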

HIGHER LEVEL TRANSFORMATIONS TO HANDLE

  • GroupBy / Reduce
  • Weight individual samples or groups
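
As an illustration of the kind of higher-level handling meant here, a sketch of group-wise and sample-weighted evaluation using pandas (hypothetical column names; not part of the current API):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "group":     ["a", "a", "b", "b"],
        "weight":    [1.0, 2.0, 1.0, 1.0],
        "actual":    [3.0, 0.5, 2.0, 7.0],
        "predicted": [2.5, 0.0, 2.0, 8.0],
    })

    # GroupBy / Reduce: compute MAE separately within each group.
    per_group_mae = (df["actual"] - df["predicted"]).abs().groupby(df["group"]).mean()

    # Weighted samples: weighted mean absolute error over all rows.
    weighted_mae = np.average((df["actual"] - df["predicted"]).abs(), weights=df["weight"])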

PROPERTIES METRICS CAN HAVE

(Nonexhaustive and to be added in the future)

  • Min or Max (optimize through minimization or maximization)
  • Binary Classification
    • Scores predicted class labels
    • Scores predicted ranking (most likely to least likely for being in one class)
    • Scores predicted probabilities
  • Multiclass Classification
    • Scores predicted class labels
    • Scores predicted probabilities
  • Regression
  • Discrete Rater Comparison (confusion matrix)
Comments
  • Automatically run 2to3 when installing on Python 3

    I added a couple of lines to setup.py so that 2to3 is run automatically by Distribute on Python 3, because we were running into some issues installing this on 3.3. I just followed the recommended steps here.

    opened by dan-blanchard 7
  • Become maintainer of this package

    Hi

    Do you have any interest in being the maintainer of this package? If not, would you mind if I help revive its status on CRAN?

    Thanks, Michael Frasco

    opened by mfrasco 3
  • ml_metrics fails to install via pip

    $ pip --version
    pip 1.4.1 from /usr/local/lib/python2.7/site-packages/pip-1.4.1-py2.7.egg (python 2.7)

    $ pip install ml_metrics
    Downloading/unpacking ml-metrics
      Downloading ml_metrics-0.1.3.zip
      Running setup.py egg_info for package ml-metrics
        Traceback (most recent call last):
          File "<string>", line 16, in <module>
          File "/Users/ndronen/Source/dissertation/projects/iclr-2014/build/ml-metrics/setup.py", line 6, in <module>
            requirements = [x.strip() for x in open("requirements.txt")]
        IOError: [Errno 2] No such file or directory: 'requirements.txt'
    Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<string>", line 16, in <module>
          File "/Users/ndronen/Source/build/ml-metrics/setup.py", line 6, in <module>
            requirements = [x.strip() for x in open("requirements.txt")]
        IOError: [Errno 2] No such file or directory: 'requirements.txt'

    opened by ndronen 3
  • I have used the kappa metric provided here and am getting near-zero values for complete disagreement instead of -1. Am I missing something?

    import numpy as np
    from ml_metrics import kappa

    x = np.array([0, 3, 2, 4, 0, 2, 0, 4, 3, 0, 2])
    y = np.array([2, 1, 3, 2, 3, 4, 2, 1, 4, 3, 1])

    # complete disagreement
    print(kappa(x, y, min_rating=0, max_rating=4))
    # -0.18627450980392224
    
    opened by prakashjayy 2
  • Suggested Metric: Mean Absolute Scaled Error

    Mean Absolute Scaled Error (MASE) is pretty widely used in econometrics and is one of the best-known metrics for forecasting. I wanted to suggest that it may be appropriate for this package as well.

    http://robjhyndman.com/papers/foresight.pdf
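
    For reference, a rough numpy sketch of MASE as commonly defined for one-step forecasts (scaling by the in-sample MAE of a naive forecast); this is an illustration, not code from this repo:

    import numpy as np

    def mase(actual, predicted, train):
        # Forecast MAE scaled by the MAE of a naive one-step forecast on the training series.
        naive_mae = np.mean(np.abs(np.diff(train)))
        return np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))) / naive_mae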

    opened by SteveBronder 2
  • Removed extra "reduce" statements.

    Removed extra "reduce" statements.

    min supports lists directly, so reduce(min, rater_a) was verbose (and potentially less efficient).

    I also improved the PEP8 compliance by getting rid of TABs used for indentation and adding spaces around operators.

    opened by dan-blanchard 2
  • Metrics::auc fails due to integer overflow

    The auc function cannot support large datasets due to integer overflow. The algorithm it uses multiplies the number of positive cases by the number of negative cases, and if these counts are large enough, their product overflows the integer type.
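
    For context, the rank-sum formulation of AUC has the product of the positive and negative counts in its denominator; a Python/numpy sketch (not the R implementation) showing where that product appears and how computing it in floating point avoids the overflow:

    import numpy as np

    def auc_rank(actual, posterior):
        # Mann-Whitney (rank-sum) formulation of AUC, ignoring ties among the scores.
        actual = np.asarray(actual)
        ranks = np.argsort(np.argsort(posterior)) + 1       # 1-based ranks of the scores
        n_pos = float(np.sum(actual == 1))                   # cast counts to float so that
        n_neg = float(len(actual) - n_pos)                   # n_pos * n_neg cannot overflow
        pos_rank_sum = float(np.sum(ranks[actual == 1]))
        return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)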

    Would you be open to a pull request that fixed this bug?

    opened by mfrasco 1
  • Metrics R package has been orphaned on CRAN

    Hi Ben, I just noticed that the maintainer status of the Metrics R package was changed to "ORPHANED" on April 21, 2017. The CRAN maintainers must have sent you some emails about issues with the package and couldn't reach you, so after a certain amount of time they set the maintainer to "ORPHANED" and incremented the package version number to 0.1.2.

    I fixed the CRAN issues, made updates to the documentation, added examples to all the functions, and incremented the version number to 0.1.3. I've pushed the updates, which you can review on my fork here. Are you interested in re-establishing yourself as the maintainer? If so, I'll submit a PR with my changes and you can submit version 0.1.3 to CRAN directly. If not, let me know and I can help you find someone to take over as the maintainer and have them submit version 0.1.3 to CRAN.

    CRAN check output from running R CMD check --as-cran Metrics_0.1.3.tar.gz:

    * using log directory ‘/Users/me/code/github-myforks/Metrics/Metrics.Rcheck’
    * using R version 3.3.2 (2016-10-31)
    * using platform: x86_64-apple-darwin13.4.0 (64-bit)
    * using session charset: UTF-8
    * using option ‘--as-cran’
    * checking for file ‘Metrics/DESCRIPTION’ ... OK
    * this is package ‘Metrics’ version ‘0.1.3’
    * checking CRAN incoming feasibility ... NOTE
    Maintainer: ‘Ben Hamner <[email protected]>’
    
    Days since last update: 4
    
    New maintainer:
      Ben Hamner <[email protected]>
    Old maintainer(s):
      ORPHANED
    
    License components with restrictions and base license permitting such:
      BSD_3_clause + file LICENSE
    File 'LICENSE':
      YEAR: 2012-2017
      COPYRIGHT HOLDER: Ben Hamner
      ORGANIZATION: copyright holder
    
    CRAN repository db overrides:
      X-CRAN-Comment: Orphaned and corrected on 2017-04-21 as check errors
        were not corrected despite reminders.
      Maintainer: ORPHANED
    CRAN repository db conflicts: ‘Maintainer’
    * checking package namespace information ... OK
    * checking package dependencies ... OK
    * checking if this is a source package ... OK
    * checking if there is a namespace ... OK
    * checking for executable files ... OK
    * checking for hidden files and directories ... OK
    * checking for portable file names ... OK
    * checking for sufficient/correct file permissions ... OK
    * checking whether package ‘Metrics’ can be installed ... OK
    * checking installed package size ... OK
    * checking package directory ... OK
    * checking DESCRIPTION meta-information ... OK
    * checking top-level files ... OK
    * checking for left-over files ... OK
    * checking index information ... OK
    * checking package subdirectories ... OK
    * checking R files for non-ASCII characters ... OK
    * checking R files for syntax errors ... OK
    * checking whether the package can be loaded ... OK
    * checking whether the package can be loaded with stated dependencies ... OK
    * checking whether the package can be unloaded cleanly ... OK
    * checking whether the namespace can be loaded with stated dependencies ... OK
    * checking whether the namespace can be unloaded cleanly ... OK
    * checking use of S3 registration ... OK
    * checking dependencies in R code ... OK
    * checking S3 generic/method consistency ... OK
    * checking replacement functions ... OK
    * checking foreign function calls ... OK
    * checking R code for possible problems ... OK
    * checking Rd files ... OK
    * checking Rd metadata ... OK
    * checking Rd line widths ... OK
    * checking Rd cross-references ... OK
    * checking for missing documentation entries ... OK
    * checking for code/documentation mismatches ... OK
    * checking Rd \usage sections ... OK
    * checking Rd contents ... OK
    * checking for unstated dependencies in examples ... OK
    * checking examples ... OK
    * checking PDF version of manual ... OK
    * DONE
    
    Status: 1 NOTE
    See
      ‘/Users/me/code/github-myforks/Metrics/Metrics.Rcheck/00check.log’
    for details.
    
    opened by ledell 1
  • Bumped up version number in setup.py to make pip install latest version with Python 3 fixes.

    Currently, the Python ml_metrics package on PyPI is not Python 3 compatible because it is out of date. I've bumped the version number up to avoid any conflicts, so if you could please run "python setup.py register" and "python setup.py sdist upload" to push this latest version, that would be extremely helpful.

    We at ETS have recently released an ML package, SciKit-Learn Laboratory (SKLL), that relies on ml_metrics for kappa, and we don't want to have to repackage your code with ours unless absolutely necessary.

    opened by dan-blanchard 1
  • Forgot an `import sys` in `setup.py`?

    It's such a small thing I think I may be the one missing something here.

    I cloned the repo, ran python setup.py build, and ran into:

    Traceback (most recent call last):
      File "setup.py", line 9, in <module>
        if sys.version_info >= (3,):
    NameError: name 'sys' is not defined
    

    Of course, doing an import sys fixes it right up.

    Just thought I'd let you know!

    opened by vietjtnguyen 1
  • Something seems wrong with kappa

    Adding tests which currently fail -- so we have something to check against.

    Also -- I am not familiar with git, so let's hope I am pushing the right buttons.

    opened by OlexiyO 1
  • AP@K Calculate

    https://github.com/benhamner/Metrics/blob/9a637aea795dc6f2333f022b0863398de0a1ca77/Python/ml_metrics/average_precision.py#L32

    Hello: I notice that running apk([1, 1, 1], [1, 1, 1], 3) does not return 1. I wonder if it should be if p in actual and p in predicted[:i+1]? Thank you.

    opened by yanyijiang09 0
  • Installation problem

    When I tried to install with pip in a virtual environment, it threw the error below.

    $ pip install ml_metrics
    Collecting ml_metrics
      Using cached ml_metrics-0.1.4.tar.gz (5.0 kB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error

      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [1 lines of output]
          error in ml_metrics setup command: use_2to3 is invalid.
          [end of output]

      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed

    × Encountered error while generating package metadata.
    ╰─> See above for output.

    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.

    opened by chaitanya-kolliboyina 0
  • Fix average precision at k calculation

    This PR fixes #49. According to the Wikipedia page on Average Precision, the metric is defined as AP@K = (1 / number of relevant documents) * sum over k = 1..K of P(k) * rel(k), where rel(k) is an indicator function equaling 1 if the item at rank k is a relevant document and zero otherwise. Note that the average is over all relevant documents, and relevant documents that are not retrieved get a precision score of zero. Before, the average was calculated over the minimum of the length of the actual list and k, which doesn't seem right: as the length of the actual list or k increases, the AP@K decreases. I fixed and cleaned up the code. The current behavior could lead to many mistakes, so please consider merging this!
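
    A sketch of AP@K following the definition quoted above (dividing by the number of relevant documents rather than by min(len(actual), k)); this illustrates the PR's argument and is not the merged code:

    def apk_wiki(actual, predicted, k=10):
        # AP@K per the definition above: precision is averaged at each hit,
        # then divided by the number of relevant items.
        predicted = predicted[:k]
        hits, score = 0, 0.0
        for i, p in enumerate(predicted):
            if p in actual and p not in predicted[:i]:  # count each relevant item only once
                hits += 1
                score += hits / (i + 1.0)
        return score / len(actual) if actual else 0.0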

    opened by raminqaf 0
  • wrong ap@k

    After I run the following in my anaconda3 environment:

    $ pip install ml_metrics
    Collecting ml_metrics
    Requirement already satisfied: numpy in /home/westwood/anaconda3/lib/python3.7/site-packages (from ml_metrics) (1.15.1)
    Requirement already satisfied: pandas in /home/westwood/anaconda3/lib/python3.7/site-packages (from ml_metrics) (0.23.4)
    Requirement already satisfied: python-dateutil>=2.5.0 in /home/westwood/anaconda3/lib/python3.7/site-packages (from pandas->ml_metrics) (2.7.3)
    Requirement already satisfied: pytz>=2011k in /home/westwood/anaconda3/lib/python3.7/site-packages (from pandas->ml_metrics) (2018.5)
    Requirement already satisfied: six>=1.5 in /home/westwood/anaconda3/lib/python3.7/site-packages (from python-dateutil>=2.5.0->pandas->ml_metrics) (1.11.0)
    Installing collected packages: ml-metrics
    Successfully installed ml-metrics-0.1.4

    In the file, [screenshot] is wrong, and differs from [screenshot].

    opened by MentalOmega 2
Owner: Ben Hamner, Co-founder and CTO of Kaggle