Timbre Dissimilarity Metrics

A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API.

Installation

pip install -e .

Usage

import timbremetrics

datasets = timbremetrics.list_datasets()
dataset = datasets[0] # get the first timbre dataset

# MAE between target dataset and pred embedding distances
metric = timbremetrics.TimbreMAE(
    margin=0.0, dataset=dataset, distance=timbremetrics.l1
)

# get numpy audio for the timbre dataset
audio = timbremetrics.get_audio(dataset)

# get arbitrary embeddings for the timbre dataset's audio
# (net stands for any model or function mapping audio to embedding vectors)
embeddings = net(audio)

# compute the metric
metric(embeddings)
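Here net is simply a placeholder for whatever embedding model is under evaluation. As a purely illustrative stand-in (not part of the timbremetrics API), and assuming get_audio returns one 1-D NumPy array per sound in the dataset, a toy spectral embedding could be computed like this:

import numpy as np

def toy_embed(clip, n_fft=2048, hop=1024):
    # Hypothetical stand-in for `net`: mean magnitude spectrum over windowed frames
    frames = np.lib.stride_tricks.sliding_window_view(clip, n_fft)[::hop]
    return np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=-1)).mean(axis=0)

# assuming `audio` is a list of 1-D float arrays, one per sound in the dataset
embeddings = np.stack([toy_embed(clip) for clip in audio])

Depending on the metric, the stacked embeddings may need to be converted to a torch tensor (e.g. torch.from_numpy(embeddings)) before calling metric(embeddings).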

Metrics

The following metrics are implemented.

Mean Squared Error

Gives the mean squared error between the upper triangles of the predicted distance matrix and target distance matrix:

$$\mathrm{MSE} = \frac{2}{n(n-1)} \sum_{i < j} \left( \hat{D}_{ij} - D_{ij} \right)^2$$

where $D$ and $\hat{D}$ are the target and predicted distance matrices over $n$ items.

Mean Absolute Error

Gives the mean absolute error between the upper triangles of the predicted distance matrix and target distance matrix:

$$\mathrm{MAE} = \frac{2}{n(n-1)} \sum_{i < j} \left| \hat{D}_{ij} - D_{ij} \right|$$
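To make the two error metrics above concrete, the comparison of upper triangles can be written in a few lines of PyTorch. This is an illustrative sketch rather than the package's implementation; emb, pred_distances and target_distances are hypothetical names standing in for an embedding matrix and the two distance matrices.

import torch

# Hypothetical example: 8 items, L1 distances between random embeddings as the
# predicted matrix, and a random symmetric zero-diagonal matrix as the target.
emb = torch.randn(8, 16)
pred_distances = torch.cdist(emb, emb, p=1)
target_distances = torch.rand(8, 8)
target_distances = 0.5 * (target_distances + target_distances.T)
target_distances.fill_diagonal_(0)

# off-diagonal upper triangle of each matrix as a flat vector
i, j = torch.triu_indices(8, 8, offset=1)
mse = torch.mean((pred_distances[i, j] - target_distances[i, j]) ** 2)
mae = torch.mean(torch.abs(pred_distances[i, j] - target_distances[i, j]))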

Item Rank Agreement

Gives the proportion of per-item distance ranks that match between the predicted distance matrix and the target distance matrix.

$$\mathrm{IRA} = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i} \mathbb{1}\left[ R_{X,ij} = R_{Y,ij} \right]$$

where $\mathbb{1}[\cdot]$ is the indicator function given by:

$$\mathbb{1}[a = b] = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{otherwise} \end{cases}$$

and $R_X$ and $R_Y$ are the distance matrices ranked per item, such that each row contains the ordinal ranks of the distances from the corresponding item. We also provide a top-k version which computes this metric considering only the closest k items in each row.
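A rough NumPy sketch of this per-item rank comparison, written from the description above rather than from the package's code (the k argument is an illustrative way to express the top-k variant):

import numpy as np

def item_rank_agreement(pred, target, k=None):
    n = target.shape[0]
    agree = total = 0
    for i in range(n):
        mask = np.arange(n) != i                          # drop the self-distance
        r_pred = np.argsort(np.argsort(pred[i, mask]))    # ordinal ranks, 0 = closest
        r_target = np.argsort(np.argsort(target[i, mask]))
        # keep all ranks, or only the k closest items according to the target
        keep = np.ones_like(r_target, dtype=bool) if k is None else r_target < k
        agree += np.sum(r_pred[keep] == r_target[keep])
        total += np.sum(keep)
    return agree / total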

Triplet Agreement

Samples pseudo-triplets from the target distance matrix according to a positivity radius and margin, and returns the proportion of these triplets for which ordering is retained in the predicted distance matrix, with the margin optionally enforced.
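A brute-force sketch of this idea, assuming the positivity radius and margin are thresholds on the target distances; every name and parameter here is illustrative rather than the package's API, and exhaustive enumeration stands in for sampling:

import numpy as np

def triplet_agreement(pred, target, radius=0.3, margin=0.1, enforce_margin=False):
    n_items = target.shape[0]
    kept = attempted = 0
    for a in range(n_items):
        for p in range(n_items):
            for neg in range(n_items):
                if len({a, p, neg}) < 3:
                    continue
                # positive within the radius, negative beyond radius + margin (in the target)
                if target[a, p] < radius and target[a, neg] > radius + margin:
                    attempted += 1
                    gap = margin if enforce_margin else 0.0
                    # ordering is retained if the predicted positive distance is smaller
                    kept += pred[a, p] + gap < pred[a, neg]
    return kept / max(attempted, 1)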

Mantel Test

The Mantel test computes Pearson's r or Spearman's rho on the condensed form of the upper triangles of the predicted and target distance matrices. The significance of the resulting correlation can be estimated using permutation analysis.
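A compact sketch of such a test using NumPy and SciPy; the mantel function and its parameters are illustrative, not the metric class provided by the package:

import numpy as np
from scipy.stats import pearsonr, spearmanr

def mantel(pred, target, method="pearson", n_perm=999, seed=0):
    # Correlate the condensed upper triangles, then estimate a two-sided p-value
    # by permuting the rows/columns of the target matrix and re-correlating.
    rng = np.random.default_rng(seed)
    corr = pearsonr if method == "pearson" else spearmanr
    iu = np.triu_indices_from(target, k=1)
    r_obs = corr(pred[iu], target[iu])[0]
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(target.shape[0])
        r_perm = corr(pred[iu], target[perm][:, perm][iu])[0]
        exceed += abs(r_perm) >= abs(r_obs)
    return r_obs, (exceed + 1) / (n_perm + 1)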

You might also like...
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Evaluating the Factual Consistency of Abstractive Text Summarization Authors: Wojciech Kryściński, Bryan McCann, Caiming Xiong, and Richard Socher Int

Evaluating different engineering tricks that make RL work

Reinforcement Learning Tricks, Index This repository contains the code for the paper "Distilling Reinforcement Learning Tricks for Video Games". Short

BARTScore: Evaluating Generated Text as Text Generation

This is the Repo for the paper: BARTScore: Evaluating Generated Text as Text Generation Updates 2021.06.28 Release online evaluation Demo 2021.06.25 R

🤖 A Python library for learning and evaluating knowledge graph embeddings

PyKEEN PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-m

Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"

Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning This is the Github repository of our paper, "Common S

Benchmark for evaluating open-ended generation

OpenMEVA Contributed by Jian Guan, Zhexin Zhang. Thank Jiaxin Wen for DeBugging. OpenMEVA is a benchmark for evaluating open-ended story generation me

Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)

Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021) Zeyu Wang, Sherry Qiu, Nicole Feng, Holly Rushmeier, Leonard McMill

Tensorflow 2 implementation of the paper: Learning and Evaluating Representations for Deep One-class Classification published at ICLR 2021

Deep Representation One-class Classification (DROC). This is not an officially supported Google product. Tensorflow 2 implementation of the paper: Lea

This repo includes our code for evaluating and improving transferability in domain generalization (NeurIPS 2021)

Transferability for domain generalization This repo is for evaluating and improving transferability in domain generalization (NeurIPS 2021), based on

Owner
Ben Hayes
AI & Music PhD researcher @ Centre for Digital Music, QMUL
📚 A collection of all the Deep Learning Metrics that I came across which are not accuracy/loss.

Rahul Vigneswaran 1 Jan 17, 2022
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie_recs Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Coll

ShopRunner 97 Jan 3, 2023
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie do

ShopRunner 96 Dec 29, 2022
Using Streamlit to host a multi-page tool with model specs and classification metrics, while also accepting user input values for prediction.

Predicitng_viability Using Streamlit to host a multi-page tool with model specs and classification metrics, while also accepting user input values for

Gopalika Sharma 1 Nov 8, 2021
Object detection evaluation metrics using Python.

Louis Facun 2 Sep 6, 2022
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

TL;DR Ignite is a high-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently. Click on the image to

null 4.2k Jan 1, 2023
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

UNION Automatic Evaluation Metric described in the paper UNION: An UNreferenced MetrIc for Evaluating Open-eNded Story Generation (EMNLP 2020). Please

null 50 Dec 30, 2022
Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology

Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology Sharon Zhou, Eric Zelikman

Stanford Machine Learning Group 34 Nov 16, 2022
Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

This is the XLM-T repository, which includes data, code and pre-trained multilingual language models for Twitter. XLM-T - A Multilingual Language Mode

Cardiff NLP 112 Dec 27, 2022