Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Overview


Deep generative models are rapidly becoming popular for the discovery of new molecules and materials. Such models learn on a large collection of molecular structures and produce novel compounds. In this work, we introduce Molecular Sets (MOSES), a benchmarking platform to support research on machine learning for drug discovery. MOSES implements several popular molecular generation models and provides a set of metrics to evaluate the quality and diversity of generated molecules. With MOSES, we aim to standardize the research on molecular generation and facilitate the sharing and comparison of new models.

For more details, please refer to the paper.

If you are using MOSES in your research paper, please cite us as

@article{10.3389/fphar.2020.565644,
  title={{M}olecular {S}ets ({MOSES}): {A} {B}enchmarking {P}latform for {M}olecular {G}eneration {M}odels},
  author={Polykovskiy, Daniil and Zhebrak, Alexander and Sanchez-Lengeling, Benjamin and Golovanov, Sergey and Tatanov, Oktai and Belyaev, Stanislav and Kurbanov, Rauf and Artamonov, Aleksey and Aladinskiy, Vladimir and Veselov, Mark and Kadurin, Artur and Johansson, Simon and  Chen, Hongming and Nikolenko, Sergey and Aspuru-Guzik, Alan and Zhavoronkov, Alex},
  journal={Frontiers in Pharmacology},
  year={2020}
}

(Figure: the MOSES pipeline.)

Dataset

We propose a benchmarking dataset refined from the ZINC database.

The set is based on the ZINC Clean Leads collection. It contains 4,591,276 molecules in total, filtered by molecular weight in the range from 250 to 350 Daltons, no more than 7 rotatable bonds, and XlogP less than or equal to 3.5. We removed molecules containing charged atoms, atoms other than C, N, S, O, F, Cl, Br, or H, and cycles longer than 8 atoms. The molecules were filtered via medicinal chemistry filters (MCFs) and PAINS filters.
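
As a rough illustration of the property filters above, here is a hedged sketch using RDKit descriptors; it uses Crippen logP as a stand-in for XlogP and is not the exact filtering code used to build the dataset.

from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski

def passes_property_filters(smiles):
    # Illustrative property filters: MW in [250, 350] Da,
    # at most 7 rotatable bonds, logP <= 3.5 (Crippen logP as a proxy for XlogP).
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return False
    return (250 <= Descriptors.MolWt(mol) <= 350
            and Lipinski.NumRotatableBonds(mol) <= 7
            and Crippen.MolLogP(mol) <= 3.5)

print(passes_property_filters('CC(=O)Nc1ccc(O)cc1'))  # paracetamol: MW ~151, fails the MW range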

The dataset contains 1,936,962 molecular structures. For experiments, we split the dataset into training, test, and scaffold test sets containing around 1.6M, 176k, and 176k molecules, respectively. The scaffold test set contains unique Bemis-Murcko scaffolds that were not present in the training and test sets. We use this set to assess how well a model can generate previously unobserved scaffolds.
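
For reference, Bemis-Murcko scaffolds can be extracted with RDKit as in the sketch below (the molecule is only an illustrative example, not drawn from the dataset).

from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

# Bemis-Murcko scaffold: keep ring systems and linkers, strip side chains.
mol = Chem.MolFromSmiles('CC(C)Cc1ccc(cc1)C(C)C(=O)O')  # ibuprofen, illustrative only
scaffold = MurckoScaffold.GetScaffoldForMol(mol)
print(Chem.MolToSmiles(scaffold))  # c1ccccc1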

Models

MOSES implements baseline models (HMM, NGram, Combinatorial) and neural generative models (CharRNN, AAE, VAE, JTN-VAE, LatentGAN); their performance is summarized in the table in the Metrics section below.

Metrics

Besides standard uniqueness and validity metrics, MOSES provides other metrics to assess the overall quality of generated molecules. Fragment similarity (Frag) and Scaffold similarity (Scaff) are cosine similarities between vectors of fragment or scaffold frequencies, respectively, of the generated and test sets. Nearest neighbor similarity (SNN) is the average similarity of generated molecules to their nearest molecules in the test set. Internal diversity (IntDiv) assesses the chemical diversity within the generated set and is computed from the average pairwise similarity of generated molecules. Fréchet ChemNet Distance (FCD) measures the difference in distributions of last-layer activations of ChemNet. Filters is the fraction of generated molecules that pass the filters applied during dataset construction (MCF and PAINS). Novelty is the fraction of unique valid generated molecules not present in the training set.
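
To make the SNN definition concrete, here is a minimal sketch based on RDKit Morgan fingerprints and Tanimoto similarity; the fingerprint radius and size are illustrative assumptions, and the actual MOSES implementation may differ.

from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def snn_sketch(generated_smiles, test_smiles):
    # Average Tanimoto similarity of each generated molecule
    # to its nearest neighbor in the test set.
    test_fps = [AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=1024)
                for s in test_smiles]
    scores = []
    for s in generated_smiles:
        mol = Chem.MolFromSmiles(s)
        if mol is None:
            continue  # skip invalid SMILES
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)
        scores.append(max(DataStructs.BulkTanimotoSimilarity(fp, test_fps)))
    return sum(scores) / len(scores)

print(snn_sketch(['c1ccccc1O', 'CCO'], ['c1ccccc1', 'CCN']))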

| Model | Valid (↑) | Unique@1k (↑) | Unique@10k (↑) | FCD Test (↓) | FCD TestSF (↓) | SNN Test (↑) | SNN TestSF (↑) | Frag Test (↑) | Frag TestSF (↑) | Scaf Test (↑) | Scaf TestSF (↑) | IntDiv (↑) | IntDiv2 (↑) | Filters (↑) | Novelty (↑) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Train | 1.0 | 1.0 | 1.0 | 0.008 | 0.4755 | 0.6419 | 0.5859 | 1.0 | 0.9986 | 0.9907 | 0.0 | 0.8567 | 0.8508 | 1.0 | 1.0 |
| HMM | 0.076±0.0322 | 0.623±0.1224 | 0.5671±0.1424 | 24.4661±2.5251 | 25.4312±2.5599 | 0.3876±0.0107 | 0.3795±0.0107 | 0.5754±0.1224 | 0.5681±0.1218 | 0.2065±0.0481 | 0.049±0.018 | 0.8466±0.0403 | 0.8104±0.0507 | 0.9024±0.0489 | 0.9994±0.001 |
| NGram | 0.2376±0.0025 | 0.974±0.0108 | 0.9217±0.0019 | 5.5069±0.1027 | 6.2306±0.0966 | 0.5209±0.001 | 0.4997±0.0005 | 0.9846±0.0012 | 0.9815±0.0012 | 0.5302±0.0163 | 0.0977±0.0142 | 0.8738±0.0002 | 0.8644±0.0002 | 0.9582±0.001 | 0.9694±0.001 |
| Combinatorial | 1.0±0.0 | 0.9983±0.0015 | 0.9909±0.0009 | 4.2375±0.037 | 4.5113±0.0274 | 0.4514±0.0003 | 0.4388±0.0002 | 0.9912±0.0004 | 0.9904±0.0003 | 0.4445±0.0056 | 0.0865±0.0027 | 0.8732±0.0002 | 0.8666±0.0002 | 0.9557±0.0018 | 0.9878±0.0008 |
| CharRNN | 0.9748±0.0264 | 1.0±0.0 | 0.9994±0.0003 | 0.0732±0.0247 | 0.5204±0.0379 | 0.6015±0.0206 | 0.5649±0.0142 | 0.9998±0.0002 | 0.9983±0.0003 | 0.9242±0.0058 | 0.1101±0.0081 | 0.8562±0.0005 | 0.8503±0.0005 | 0.9943±0.0034 | 0.8419±0.0509 |
| AAE | 0.9368±0.0341 | 1.0±0.0 | 0.9973±0.002 | 0.5555±0.2033 | 1.0572±0.2375 | 0.6081±0.0043 | 0.5677±0.0045 | 0.991±0.0051 | 0.9905±0.0039 | 0.9022±0.0375 | 0.0789±0.009 | 0.8557±0.0031 | 0.8499±0.003 | 0.996±0.0006 | 0.7931±0.0285 |
| VAE | 0.9767±0.0012 | 1.0±0.0 | 0.9984±0.0005 | 0.099±0.0125 | 0.567±0.0338 | 0.6257±0.0005 | 0.5783±0.0008 | 0.9994±0.0001 | 0.9984±0.0003 | 0.9386±0.0021 | 0.0588±0.0095 | 0.8558±0.0004 | 0.8498±0.0004 | 0.997±0.0002 | 0.6949±0.0069 |
| JTN-VAE | 1.0±0.0 | 1.0±0.0 | 0.9996±0.0003 | 0.3954±0.0234 | 0.9382±0.0531 | 0.5477±0.0076 | 0.5194±0.007 | 0.9965±0.0003 | 0.9947±0.0002 | 0.8964±0.0039 | 0.1009±0.0105 | 0.8551±0.0034 | 0.8493±0.0035 | 0.976±0.0016 | 0.9143±0.0058 |
| LatentGAN | 0.8966±0.0029 | 1.0±0.0 | 0.9968±0.0002 | 0.2968±0.0087 | 0.8281±0.0117 | 0.5371±0.0004 | 0.5132±0.0002 | 0.9986±0.0004 | 0.9972±0.0007 | 0.8867±0.0009 | 0.1072±0.0098 | 0.8565±0.0007 | 0.8505±0.0006 | 0.9735±0.0006 | 0.9498±0.0006 |

For comparison of molecular properties, we computed the Wasserstein-1 distance between distributions of molecules in the generated and test sets. Below, we provide plots for lipophilicity (logP), Synthetic Accessibility (SA), Quantitative Estimation of Drug-likeness (QED) and molecular weight.

(Plots: distributions of logP, SA, QED, and molecular weight for the generated and test sets.)
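
A minimal sketch of this kind of comparison for logP, assuming SciPy is available and using RDKit's Crippen logP (the exact descriptors and settings used in MOSES may differ):

from rdkit import Chem
from rdkit.Chem import Crippen
from scipy.stats import wasserstein_distance

def logp_wasserstein(generated_smiles, test_smiles):
    # Wasserstein-1 distance between the logP distributions
    # of the generated and test sets.
    def logp_values(smiles_list):
        mols = (Chem.MolFromSmiles(s) for s in smiles_list)
        return [Crippen.MolLogP(m) for m in mols if m is not None]
    return wasserstein_distance(logp_values(generated_smiles), logp_values(test_smiles))

print(logp_wasserstein(['CCO', 'c1ccccc1O'], ['CCN', 'c1ccccc1']))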

Installation

PyPi

The simplest way to install MOSES (models and metrics) is to first install RDKit with conda install -yq -c rdkit rdkit and then install MOSES (molsets) from PyPI with pip install molsets. If you want to use LatentGAN, you should also install additional dependencies using bash install_latentgan_dependencies.sh.

If you are using Ubuntu, you should also install system libraries required by RDKit: sudo apt-get install libxrender1 libxext6.
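
After installation, a quick smoke test along the following lines (an illustrative check, not part of MOSES itself) should run without errors:

import rdkit
import moses

print(rdkit.__version__)               # RDKit is importable
print(len(moses.get_dataset('test')))  # loads the MOSES test set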

Docker

  1. Install docker and nvidia-docker.

  2. Pull an existing image (4.1Gb to download) from DockerHub:

docker pull molecularsets/moses

or clone the repository and build it manually:

git clone https://github.com/molecularsets/moses.git
nvidia-docker image build --tag molecularsets/moses moses/
  3. Create a container:
nvidia-docker run -it --name moses --network="host" --shm-size 10G molecularsets/moses
  4. The dataset and source code are available inside the docker container at /moses:
docker exec -it moses bash

Manually

Alternatively, install dependencies and MOSES manually.

  1. Clone the repository:
git lfs install
git clone https://github.com/molecularsets/moses.git
  2. Install RDKit for metrics calculation.

  3. Install MOSES:

python setup.py install
  4. (Optional) Install dependencies for LatentGAN:
bash install_latentgan_dependencies.sh

Benchmarking your models

  • Install MOSES as described in the previous section.

  • Get train, test and test_scaffolds datasets using the following code:

import moses

train = moses.get_dataset('train')
test = moses.get_dataset('test')
test_scaffolds = moses.get_dataset('test_scaffolds')
  • You can use a standard torch DataLoader in your models. We provide a simple StringDataset class for convenience:
from torch.utils.data import DataLoader

import moses
from moses import CharVocab, StringDataset

train = moses.get_dataset('train')
vocab = CharVocab.from_data(train)
train_dataset = StringDataset(vocab, train)
train_dataloader = DataLoader(
    train_dataset, batch_size=512,
    shuffle=True, collate_fn=train_dataset.default_collate
)

for with_bos, with_eos, lengths in train_dataloader:
    ...
  • Calculate metrics from your model's samples. We recommend sampling at least 30,000 molecules:
import moses
metrics = moses.get_all_metrics(list_of_generated_smiles)
  • Add generated samples and metrics to your repository. Run the experiment multiple times to estimate the variance of the metrics (see the sketch below).
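
For instance, a rough sketch of such a variance estimate could look as follows; sample_from_my_model is a hypothetical stand-in for your model's sampling function.

import numpy as np
import moses

def sample_from_my_model(n):
    # Hypothetical placeholder: replace with your model's sampling code,
    # returning a list of n SMILES strings.
    raise NotImplementedError

runs = [moses.get_all_metrics(sample_from_my_model(30000)) for _ in range(3)]
for key in runs[0]:
    values = [run[key] for run in runs]
    print(f'{key}: {np.mean(values):.4f} ± {np.std(values):.4f}')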

Reproducing the baselines

End-to-End launch

You can run pretty much everything with:

python scripts/run.py

This will split the dataset, train the models, generate new molecules, and calculate the metrics. Evaluation results will be saved in metrics.csv.
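
The saved results can then be inspected, for example with pandas (assuming it is installed):

import pandas as pd

metrics = pd.read_csv('metrics.csv')
print(metrics.head())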

You can specify the GPU device index as cuda:n (or cpu for CPU) and/or model by running:

python scripts/run.py --device cuda:1 --model aae

For more details run python scripts/run.py --help.

You can reproduce evaluation of all models with several seeds by running:

sh scripts/run_all_models.sh

Training

python scripts/train.py <model name> \
       --train_load <train dataset> \
       --model_save <path to model> \
       --config_save <path to config> \
       --vocab_save <path to vocabulary>

To get a list of supported models run python scripts/train.py --help.

For more details on a certain model, run python scripts/train.py <model name> --help.

Generation

python scripts/sample.py <model name> \
       --model_load <path to model> \
       --vocab_load <path to vocabulary> \
       --config_load <path to config> \
       --n_samples <number of samples> \
       --gen_save <path to generated dataset>

To get a list of supported models run python scripts/sample.py --help.

For more details on a certain model, run python scripts/sample.py <model name> --help.

Evaluation

python scripts/eval.py \
       --ref_path <reference dataset> \
       --gen_path <generated dataset>

For more details run python scripts/eval.py --help.

Comments
  • #2: Request for exact steps to run each of the 5 models and required packages with versions

    Following up from the issue 4 days ago (Dec 14, 2019)... No luck for me. The instructions starting from "Benchmarking your models" and below are actually not clear to me. Exactly what needs to be done, step by step, to run each of the 5 models? I would appreciate knowing the steps. For example, what exactly is the path --gen_path for the aae, charRNN, organ, latentgan and vae models under Training, Generation and Evaluation? What special steps are needed for latentgan, e.g. regarding ddc_pub v3, molvengen, ...? How exactly do I execute sh scripts/run_all_models.sh? I am using Python 3.7.3, Jupyter Notebook 6.0.2, and have installed rdkit 2019.03.4.0, molsets 0.2 and many other packages. Is there an exact requirements list of packages required to run MOSES?
    Will appreciate your clarifications. Thanks,

    opened by webservicereco 6
  • value of FCD Test and FCD Test SF

    I trained the char-rnn model on my PC and got 30,000 samples from this generative model. After I evaluated the results between MOSES and my own, something weird happened: my FCD Test and FCD TestSF values are much smaller than your results. Why is that?

    | char-rnn       | MOSES  | MY OWN RESULT |
    |----------------|:------:|--------------:|
    | FCD (↓) Test   | 0.355  | 0.2616        |
    | FCD (↓) TestSF | 0.8995 | 0.7881        |

    opened by xusworld 6
  • RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:1 Long tensor

    I run the code with CUDA, but I face this problem; how can I solve it?

    Traceback (most recent call last):
      File "scripts/run.py", line 219, in <module>
        main(config)
      File "scripts/run.py", line 199, in main
        train_model(config, model, train_path, test_path)
      File "scripts/run.py", line 127, in train_model
        trainer_script.main(model, trainer_config)
      File "/home/cyy/Code/moses/scripts/train.py", line 62, in main
        trainer.fit(model, train_data, val_data)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/moses/aae/trainer.py", line 284, in fit
        self._train(model, train_loader, val_loader, logger)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/moses/aae/trainer.py", line 219, in _train
        tqdm_data, criterions, optimizers)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/moses/aae/trainer.py", line 116, in _train_epoch
        latent_codes = model.encoder_forward(*encoder_inputs)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/moses/aae/model.py", line 109, in encoder_forward
        return self.encoder(*args, **kwargs)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/moses/aae/model.py", line 26, in forward
        x = pack_padded_sequence(x, lengths, batch_first=True)
      File "/home/cyy/anaconda3/envs/moses/lib/python3.7/site-packages/torch/nn/utils/rnn.py", line 249, in pack_padded_sequence
        _VF._pack_padded_sequence(input, lengths, batch_first)
    RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:1 Long tensor

    opened by viko-3 4
  • RuntimeError when executing the run.py

    Following the instructions on the readme, I did an End-to-End launch, by running the run.py in the scripts folder.

    Traceback (most recent call last):
      File "run.py", line 208, in <module>
        main(config)
      File "run.py", line 190, in main
        sample_from_model(config, model)
      File "run.py", line 127, in sample_from_model
        sampler_script.main(model, sampler_config)
      File "/content/moses/scripts/moses/scripts/sample.py", line 47, in main
        min(n, config.n_batch), config.max_len
      File "/usr/local/lib/python3.7/site-packages/molsets-0.1.4-py3.7.egg/moses/vae/model.py", line 218, in sample
        i_eos_mask = ~eos_mask & (w == self.eos)
    RuntimeError: Expected object of scalar type Byte but got scalar type Bool for argument #2 'other'

    How could I resolve this?

    opened by vandan-revanur 4
  • ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found  in CentOS after using FCD module

    This is the error message:

    (moses-env) [trial0@xps8500 moses]$ python scripts/train.py organ --help
    Traceback (most recent call last):
      File "scripts/train.py", line 5, in <module>
        import rdkit
      File "/home/trial0/anaconda3/envs/moses-env/lib/python3.6/site-packages/rdkit/__init__.py", line 2, in <module>
        from .rdBase import rdkitVersion as __version__
    ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/trial0/anaconda3/envs/moses-env/lib/python3.6/site-packages/rdkit/rdBase.so)

    However, rdkit itself is working:

    (moses-env) [trial0@xps8500 ~]$ python
    Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from __future__ import print_function
    >>> from rdkit import Chem
    >>> m = Chem.MolFromSmiles('Cc1ccccc1')
    >>> m
    <rdkit.Chem.rdchem.Mol object at 0x7f1e30b8bf80>


    This is under CentOS 7.6 after introducing the FCD module with pytorch 1.01. It looks like this is not an rdkit problem, because ORGAN is still running under this env without any problem. Any idea? Thanks!

    opened by toushi68 4
  • where is ddc_pub?

    Hi, did you forget to tell us how to specify "ddc_pub"? It is pulled in via from .model import LatentGAN, and moses/latentgan/model.py, line 4, has: from ddc_pub import ddc_v3 as ddc

    opened by toushi68 3
  • Insights on VAE's KL annealing scheme

    Hello,

    In most implementations of VAE for molecular generation on the web there seems to be a trend to downweight the KL penalty/suppress the reparametrization in the VAE training, basically degenerating the model into a standard AE. The reason being that a vanilla VAE seems to be unable to learn from a SMILES dataset otherwise.

    In Moses VAE's training scheme it seems to me that the KL penalty has a weighting factor that starts at 0 and grows linearly towards 1 with the number of epochs increasing.

    Do you find that this annealing of the KL term gets the best of both worlds, i.e. not training just a plain AE while still being able to achieve low reconstruction error and high log-likelihood? Any insights on the impact of this scheme, and of the KL penalty in general, on the training of a VAE on a SMILES dataset?

    Thanks for your amazing repo!

    opened by maxime-langevin 3
  • multi gpu experiments

    Hi,

    Has there been any effort in scaling the models to multi-GPU systems for training? I am not a molecular domain expert. I am curious if there are situations where using multiple GPUs would help for the models used in the benchmark suite.

    Best, Trinayan

    opened by trinayan 2
  • Continue to have errors. Request Help. Please see attached 3 txt files.

    opened by webservicereco 2
  • TypeError: get_all_metrics() got an unexpected keyword argument 'train'

    Any modification from the last update? I got an error message from:

        python scripts/eval.py --n_jobs 4 --device cuda:0 --test_path data/test.csv --ptest_path data/test_stats.npz --test_scaffolds_path data/test_scaffolds.csv --ptest_scaffolds_path data/test_scaffolds_stats.npz --gen_path checkpoints/organ_generated.csv
        Traceback (most recent call last):
          File "scripts/eval.py", line 87, in <module>
            main(config)
          File "scripts/eval.py", line 42, in main
            train=train)
        TypeError: get_all_metrics() got an unexpected keyword argument 'train'

    It was OK before. Oh Happy 1024!

    opened by toushi68 2
  • Add a preliminary version of the LatentGAN

    In accordance with mail conversation, here is the part of the LatentGAN that has so far been made public. The heteroencoder will be published soon, after which the code refactoring will be performed.

    opened by SeemonJ 2
  • Error installing molsets due to dependency pomegranate==0.12.0

    pip install molsets on Ubuntu 20.04.4 LTS with Python 3.8.13 and GCC 7.5.0 fails due to an error in installing a dependency.

    Building wheel for pomegranate (setup.py) resulted in the following error:

    building 'pomegranate.distributions.NeuralNetworkWrapper' extension
    
      gcc -pthread -B /home/dhanajayb/anaconda3/envs/DeepChem/compiler_compat -Wl,--sysroot=/ -Wsign-compare 
      -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC 
      -I/home/dhanajayb/anaconda3/envs/DeepChem/include/python3.8 
      -I/home/dhanajayb/anaconda3/envs/DeepChem/lib/python3.8/site-packages/numpy/core/include 
      -c pomegranate/distributions/NeuralNetworkWrapper.c 
      -o build/temp.linux-x86_643.8/pomegranate/distributions/NeuralNetworkWrapper.o
      
      gcc: error: pomegranate/distributions/NeuralNetworkWrapper.c: No such file or directory
      error: command '/usr/bin/gcc' failed with exit code 1
      ----------------------------------------
      ERROR: Failed building wheel for pomegranate
    

    pip install pomegranate successfully installs pomegranate-0.14.8 on this machine. Has anyone else experienced this issue?

    opened by dbhaskar92 0
  • JTN-VAE model implementation

    Hi! Thanks a lot for your efforts in integrating these models. In the README, you show the results for JTN-VAE. However, it seems that there is no real implementation of JTN-VAE. Could you show me a way to use JTN-VAE? Thanks!

    opened by gcc17 0
  • ChemVAE support

    Hi,

    Thanks for your wonderful software. I was just wondering, since you have cited ChemVAE among your supported models, whether you have implemented it. In the VAE class I didn't see any GRU layer followed by conv1d layers as described in the ChemVAE paper.

    Thanks, Mohsen

    opened by Naghipourfar 0
  • why does the variable 'vocab' there have the property - 'vectors'?

    Hi, I have some confusion about the variable **'vocab'** there. I'm a beginner in this field, so maybe these are silly questions, but hopefully you can give me some suggestions. Thanks!

    First, was the variable 'vocab' generated by the CharVocab.from_data function, as mentioned in README.md? If so, I couldn't find a property called 'vectors'. What exactly is the type of the variable 'vocab' there? Did I miss some other key points about 'vocab'?

    opened by HandanDiana 1
  • Is the validity check of smiles in moses the same as RDKit?

    I have this function to check the validity of SMILES, based on RDKit:

    from rdkit import Chem
    def checksmi(smi):
        m = Chem.MolFromSmiles(smi,sanitize=False)
        if m is None:
            #print('invalid SMILES')
            v = 0
        else:
            #print("valid smiles.")
            v = 1
            try:
                Chem.SanitizeMol(m)
            except:
                #print('invalid chemistry')
                v = 0
        return v
    

    Is this the same as the validity given by MOSES when we run the code below? Does MOSES's validity check only for valid SMILES (grammar), or also for valid chemistry?

    import moses
    metrics = moses.get_all_metrics(list_to_evaluate)
    print(metrics)
    {'valid': 0.8571428571428572,
     'unique@1000': 1.0,
     'unique@10000': 1.0,
     'FCD/Test': 52.710485508527654,
     'SNN/Test': 0.2737954681118329,
     'Frag/Test': 0.26661035762724716,
     'Scaf/Test': nan,
     'FCD/TestSF': 54.30007202575979,
     'SNN/TestSF': 0.23380156854788461,
     'Frag/TestSF': 0.3777161249748274,
     'Scaf/TestSF': nan,
     'IntDiv': 0.7468284898334079,
     'IntDiv2': 0.578859979666881,
     'Filters': 0.6666666666666666,
     'logP': 3.048971913797608,
     'SA': 0.8860967344024802,
     'QED': 0.3780715536953819,
     'weight': 184.10358258270205,
     'Novelty': 1.0}
    
    opened by amine179 0