Training BERT with Compute/Time (Academic) Budget

Overview

This repository contains scripts for pre-training and finetuning BERT-like models with limited time and compute budget. The code is based on the work presented in the following paper:

Peter Izsak, Moshe Berchansky, Omer Levy, How to Train BERT with an Academic Budget (EMNLP 2021).

Installation

The pre-training and finetuning scripts are based on the DeepSpeed and HuggingFace Transformers libraries.

Preliminary Installation

We recommend creating a virtual environment with Python 3.6+, PyTorch, and apex.
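For example, a minimal environment setup might look like the following sketch (the environment name and Python version are illustrative only, and apex may additionally require its CUDA-extension build; see the apex README):

conda create -n budget-bert python=3.8
conda activate budget-bert
pip install torch                          # choose the build matching your CUDA version
git clone https://github.com/NVIDIA/apex
cd apex && pip install -v --no-cache-dir ./ && cd ..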

Installation Requirements

pip install -r requirements.txt

We suggest running DeepSpeed's ds_report utility to verify that DeepSpeed components can be compiled (JIT).
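For example, after installing the requirements, the report can be generated directly from the command line:

ds_report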

Dataset

The dataset directory includes scripts to pre-process the datasets we used in our experiments (Wikipedia, Bookcorpus). See dedicated README for full details.

Pretraining

Pretraining script: run_pretraining.py

For all possible pretraining arguments see: python run_pretraining.py -h

We highly suggest reviewing the various training features we provide within the library.

Example for training with the best configuration presented in our paper (24-layers/1024H/time-based learning rate schedule/fp16):
deepspeed run_pretraining.py \
  --model_type bert-mlm --tokenizer_name bert-large-uncased \
  --hidden_act gelu \
  --hidden_size 1024 \
  --num_hidden_layers 24 \
  --num_attention_heads 16 \
  --intermediate_size 4096 \
  --hidden_dropout_prob 0.1 \
  --attention_probs_dropout_prob 0.1 \
  --encoder_ln_mode pre-ln \
  --lr 1e-3 \
  --train_batch_size 4096 \
  --train_micro_batch_size_per_gpu 32 \
  --lr_schedule time \
  --curve linear \
  --warmup_proportion 0.06 \
  --gradient_clipping 0.0 \
  --optimizer_type adamw \
  --weight_decay 0.01 \
  --adam_beta1 0.9 \
  --adam_beta2 0.98 \
  --adam_eps 1e-6 \
  --total_training_time 24.0 \
  --early_exit_time_marker 24.0 \
  --dataset_path <dataset path> \
  --output_dir /tmp/training-out \
  --print_steps 100 \
  --num_epochs_between_checkpoints 10000 \
  --job_name pretraining_experiment \
  --project_name budget-bert-pretraining \
  --validation_epochs 3 \
  --validation_epochs_begin 1 \
  --validation_epochs_end 1 \
  --validation_begin_proportion 0.05 \
  --validation_end_proportion 0.01 \
  --validation_micro_batch 16 \
  --deepspeed \
  --data_loader_type dist \
  --do_validation \
  --use_early_stopping \
  --early_stop_time 180 \
  --early_stop_eval_loss 6 \
  --seed 42 \
  --fp16

Time-based Training

Pretraining can be limited to a time budget by defining --total_training_time=24.0 (24 hours, for example).

Time-based Learning Rate Scheduling

The learning rate can be scheduled to change according to the configured total training time. The argument --total_training_time controls the total time assigned for the trainer to run, and must be specified in order to use time-based learning rate scheduling.

[Figure: time-based learning rate schedule]

To select time-based learning rate scheduling, set --lr_schedule time and choose a shape for the annealing curve (for example --curve=linear, as seen in the figure). The warmup phase of the learning rate is defined by --warmup_proportion, which specifies the proportion of the time budget (as defined by --total_training_time) used to reach the peak learning rate. For example, in a 24-hour training session, warmup_proportion=0.1 would account for 10% of 24 hours, that is, 2.4 hours (144 minutes), to reach the peak learning rate. The learning rate is then scheduled to reach 0 at the end of the time budget. See the provided figure for an example.
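For illustration, these are the schedule-related flags from the full pretraining command above; together they configure a linear time-based schedule with a 6% warmup over a 24-hour budget:

--lr_schedule time --curve linear --warmup_proportion 0.06 --total_training_time 24.0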

Checkpoints and Finetune Checkpoints

There are 2 types of checkpoints that can be enabled:

  • Training checkpoint - saves model weights, optimizer state and training args. Defined by --num_epochs_between_checkpoints.
  • Finetuning checkpoint - saves model weights and configuration to be used for finetuning later on. Defined by --finetune_time_markers.

finetune_time_markers can be assigned multiple points in the training time budget by providing a list of markers expressed as fractions of overall training progress. For example, --finetune_time_markers=0.5 saves a finetuning checkpoint when 50% of the training time budget is reached. For multiple finetuning checkpoints, separate the markers with commas and no spaces, e.g. 0.5,0.6,0.9.
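For example, to save finetuning checkpoints at 50%, 60%, and 90% of the time budget, add the following flag to the pretraining command:

--finetune_time_markers=0.5,0.6,0.9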

Validation Scheduling

Enable validation during pre-training with --do_validation.

Control the number of epochs between validation runs with --validation_epochs.

To run validation more frequently at the beginning and end of training (more often than validation_epochs), use validation_begin_proportion and validation_end_proportion to specify the proportion of the time budget covered by each phase, and validation_epochs_begin and validation_epochs_end to set the validation frequency within those phases.
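For example, the validation flags from the full pretraining command above validate every 3 epochs, switching to every epoch during the first 5% and the last 1% of the time budget:

--do_validation --validation_epochs 3 --validation_epochs_begin 1 --validation_epochs_end 1 \
--validation_begin_proportion 0.05 --validation_end_proportion 0.01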

Mixed Precision Training

Mixed precision is supported by adding --fp16. Use --fp16_backend=ds to use DeepSpeed's mixed precision backend and --fp16_backend=apex for apex (--fp16_opt controls the optimization level).
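For example, either of the following selects a mixed precision backend:

--fp16 --fp16_backend=ds
--fp16 --fp16_backend=apex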

Finetuning

Use run_glue.py to run finetuning for a saved checkpoint on GLUE tasks.

The finetuning script is identical to the one provided by HuggingFace, with the addition of our model.

For all possible finetuning arguments see: python run_glue.py -h

Example for finetuning on MRPC:
python run_glue.py \
  --model_name_or_path <path to model> \
  --task_name MRPC \
  --max_seq_length 128 \
  --output_dir /tmp/finetuning \
  --overwrite_output_dir \
  --do_train --do_eval \
  --evaluation_strategy steps \
  --per_device_train_batch_size 32 --gradient_accumulation_steps 1 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-5 \
  --weight_decay 0.01 \
  --eval_steps 50 \
  --max_grad_norm 1.0 \
  --num_train_epochs 5 \
  --lr_scheduler_type polynomial \
  --warmup_steps 50

Generating Pretraining Commands

We provide a useful script for generating multiple (or single) pretraining commands by using python generate_training_commands.py.

python generate_training_commands.py -h

  --param_file PARAM_FILE   Hyperparameter and configuration yaml
  --job_name JOB_NAME       job name
  --init_cmd INIT_CMD       initialization command (deepspeed or python directly)

A parameter yaml must be defined with 2 main keys: hyperparameters, where each argument is given a list of possible values, and default_parameters, which holds default values. Each generated command is one possible combination of the arguments specified in the hyperparameters section.

Example:

hyperparameters:
  param1: [val1, val2]
  param2: [val1, val2]

default_parameters:
  param3: 0.0

will result in:

deepspeed run_pretraining.py --param1=val1 --param2=val1 --param3=0.0
deepspeed run_pretraining.py --param1=val1 --param2=val2 --param3=0.0
deepspeed run_pretraining.py --param1=val2 --param2=val1 --param3=0.0
deepspeed run_pretraining.py --param1=val2 --param2=val2 --param3=0.0
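For example, assuming the yaml above is saved as params.yaml (a hypothetical filename), the commands can be generated with:

python generate_training_commands.py --param_file params.yaml --job_name pretraining_experiment --init_cmd deepspeed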

Citation

If you find the paper or the code useful, please cite:

@article{izsak2021,
  author={Izsak, Peter and Berchansky, Moshe and Levy, Omer},
  title={How to Train BERT with an Academic Budget},
  journal={arXiv preprint arXiv:2104.07705},
  url={https://arxiv.org/abs/2104.07705},
  year={2021}
}
Comments
  • GLUE results not reproducible

    Hello,

    I understand the GLUE results mentioned in the paper are for the test set, but we are not able to reproduce them. Our pretrained model had a loss of 1.72, and after sweeping through the hyperparameters mentioned in Table 7 of the paper, the best score we got on CoLA is 43% (on the validation set), whereas Table 4 reports a test result of 57.1%. We did the finetuning on 1 GPU.

    Are we missing something?

    opened by lumliolum 11
  • Unable to train a roberta model?

    Hi everyone, thanks for publishing your code :) I've been trying to run your code on some data I had (just to see how it goes) and it seems that something is stalling right after the training has been initialized.

    you can find the log here: https://pastebin.com/Z2CrZA9C

    When I Ctrl+C the script, it seems to be stalled at a subprocess creation point in deepspeed (whether I wait 10 seconds or 30 minutes):

    ^CKilling subprocess 3401541
    Killing subprocess 3401542
    Main process received SIGINT, exiting
    Traceback (most recent call last):
      File "/home/ROCQ/alpage/seddah/src/miniconda3/envs/budgetBERT/bin/deepspeed", line 6, in <module>
        main()
      File "/home/ROCQ/alpage/seddah/src/miniconda3/envs/budgetBERT/lib/python3.9/site-packages/deepspeed/launcher/runner.py", line 362, in main
        result.wait()
      File "/home/ROCQ/alpage/seddah/src/miniconda3/envs/budgetBERT/lib/python3.9/subprocess.py", line 1189, in wait
        return self._wait(timeout=timeout)
      File "/home/ROCQ/alpage/seddah/src/miniconda3/envs/budgetBERT/lib/python3.9/subprocess.py", line 1917, in _wait
        (pid, sts) = self._try_wait(0)
      File "/home/ROCQ/alpage/seddah/src/miniconda3/envs/budgetBERT/lib/python3.9/subprocess.py", line 1875, in _try_wait
        (pid, sts) = os.waitpid(self.pid, wait_flags)
    KeyboardInterrupt

    is there any chance you'd have an idea?

    Best, Djamé ps: say hi to Omer :)

    opened by dseddah 10
  • Question about validation and testing

    Hello,

    I wonder if there is a train-validation-test split or a train-test split only. I'm asking because during dataset generation train-* and test-* files are generated, but no valid-* files. Then later, during pre-training, distributed_pretraining_dataset.py searches for valid-* files rather than test-* files (line 164). There are no such files, and therefore training crashes...

    Thanks a lot! David

    opened by peerdavid 6
  • How to combine wiki and bookcorpus into one file?

    I found that in the dataset description, we can use process_data.py to pre-process the wikipedia/bookcorpus datasets into a single text file.

    What if I want to process these two datasets at the same time? At which step should I combine them? Thanks!

    opened by shizhediao 4
  • bert_model not used

    Hi, I noticed some inconsistencies in the create_pretraining_data script and wanted to make sure they don't cause any further issues:

    1. The script data/create_pretraining_data.py has an unused argument bert_model. Should it be used for anything?
    2. And BertTokenizer is instantiated with max_len 512. Shouldn't it rather be with max_seq_length?

    Best!

    opened by senisioi 2
  • Clarification on "sparse token prediction" or "sparse output prediction"

    First off, thanks for the great paper and making your code publicly available, it's been really valuable.

    I was hoping to clarify what you meant by "sparse token prediction" (in Section 3.1, under Software) and "sparse output prediction" (in Table 2). For both instances, you cite the original RoBERTa paper; however, I can't find any mention of sparsity in their paper.

    In your model args, you define sparse mask prediction as predicting only masked tokens: https://github.com/IntelLabs/academic-budget-bert/blob/ea000838156e3be251699ad6a3c8b1339c76e987/pretraining/args/model_args.py#L105

    Is this what is meant in the paper? If so, what is the time saved reported in Table 2 relative to? If not, what do you mean by sparse token/output prediction?

    Thank you!

    opened by mirandrom 2
  • Which kind of optimization you use from DeepSpeed

    Hi,

    First and foremost. Thank you for your excellent work.

    Could you please elaborate more on what DeepSpeed functionality you used? Zero-1 or Zero-2 optimization, for example?

    opened by jzhang38 1
  • Question: Easiest way to load deepspeed checkpoints as standard PyTorch models?

    Hello, Thank you for this codebase.

    When I try to load a finetune checkpoint with BertLMHeadModel.from_pretrained(model_path), it fails to load the weights in the file. I get warnings like:

    Some weights of the model checkpoint at training_out/random_init_large/run_1/pretraining_experiment-/epoch1000000_step63491/ were not used when initializing BertLMHeadModel: ['bert.encoder.FinalLayerNorm.weight', 'bert.encoder.FinalLayerNorm.bias', 'bert.encoder.layer.0.PreAttentionLayerNorm.weight', 'bert.encoder.layer.0.PreAttentionLayerNorm.bias', 'bert.encoder.layer.0.PostAttentionLayerNorm.weight', 'bert.encoder.layer.0.PostAttentionLayerNorm.bias', 'bert.encoder.layer.0.intermediate.dense_act.dense.weight', 'bert.encoder.layer.0.intermediate.dense_act.dense.bias', 'bert.encoder.layer.1.PreAttentionLayerNorm.weight', 'bert.encoder.layer.1.PreAttentionLayerNorm.bias', 'bert.encoder.layer.1.PostAttentionLayerNorm.weight', 'bert.encoder.layer.1.PostAttentionLayerNorm.bias', 'bert.encoder.layer.1.intermediate.dense_act.dense.weight', 'bert.encoder.layer.1.intermediate.dense_act.dense.bias', 'bert.encoder.layer.2.PreAttentionLayerNorm.weight', 'bert.encoder.layer.2.PreAttentionLayerNorm.bias', 'bert.encoder.layer.2.PostAttentionLayerNorm.weight', 'bert.encoder.layer.2.PostAttentionLayerNorm.bias', 'bert.encoder.layer.2.intermediate.dense_act.dense.weight', 'bert.encoder.layer.2.intermediate.dense_act.dense.bias', 'bert.encoder.layer.3.PreAttentionLayerNorm.weight', 'bert.encoder.layer.3.PreAttentionLayerNorm.bias', 'bert.encoder.layer.3.PostAttentionLayerNorm.weight', 'bert.encoder.layer.3.PostAttentionLayerNorm.bias', 'bert.encoder.layer.3.intermediate.dense_act.dense.weight', 'bert.encoder.layer.3.intermediate.dense_act.dense.bias', 'bert.encoder.layer.4.PreAttentionLayerNorm.weight', 'bert.encoder.layer.4.PreAttentionLayerNorm.bias', 'bert.encoder.layer.4.PostAttentionLayerNorm.weight', 'bert.encoder.layer.4.PostAttentionLayerNorm.bias', 'bert.encoder.layer.4.intermediate.dense_act.dense.weight', 'bert.encoder.layer.4.intermediate.dense_act.dense.bias', 'bert.encoder.layer.5.PreAttentionLayerNorm.weight', 'bert.encoder.layer.5.PreAttentionLayerNorm.bias', 'bert.encoder.layer.5.PostAttentionLayerNorm.weight', 'bert.encoder.layer.5.PostAttentionLayerNorm.bias', 'bert.encoder.layer.5.intermediate.dense_act.dense.weight', 'bert.encoder.layer.5.intermediate.dense_act.dense.bias', 'bert.encoder.layer.6.PreAttentionLayerNorm.weight', 'bert.encoder.layer.6.PreAttentionLayerNorm.bias', 'bert.encoder.layer.6.PostAttentionLayerNorm.weight', 'bert.encoder.layer.6.PostAttentionLayerNorm.bias', 'bert.encoder.layer.6.intermediate.dense_act.dense.weight', 'bert.encoder.layer.6.intermediate.dense_act.dense.bias', 'bert.encoder.layer.7.PreAttentionLayerNorm.weight', 'bert.encoder.layer.7.PreAttentionLayerNorm.bias', 'bert.encoder.layer.7.PostAttentionLayerNorm.weight', 'bert.encoder.layer.7.PostAttentionLayerNorm.bias', 'bert.encoder.layer.7.intermediate.dense_act.dense.weight', 'bert.encoder.layer.7.intermediate.dense_act.dense.bias', 'bert.encoder.layer.8.PreAttentionLayerNorm.weight', 'bert.encoder.layer.8.PreAttentionLayerNorm.bias', 'bert.encoder.layer.8.PostAttentionLayerNorm.weight', 'bert.encoder.layer.8.PostAttentionLayerNorm.bias', 'bert.encoder.layer.8.intermediate.dense_act.dense.weight', 'bert.encoder.layer.8.intermediate.dense_act.dense.bias', 'bert.encoder.layer.9.PreAttentionLayerNorm.weight', 'bert.encoder.layer.9.PreAttentionLayerNorm.bias', 'bert.encoder.layer.9.PostAttentionLayerNorm.weight', 'bert.encoder.layer.9.PostAttentionLayerNorm.bias', 'bert.encoder.layer.9.intermediate.dense_act.dense.weight', 'bert.encoder.layer.9.intermediate.dense_act.dense.bias', 
'bert.encoder.layer.10.PreAttentionLayerNorm.weight', 'bert.encoder.layer.10.PreAttentionLayerNorm.bias', 'bert.encoder.layer.10.PostAttentionLayerNorm.weight', 'bert.encoder.layer.10.PostAttentionLayerNorm.bias', 'bert.encoder.layer.10.intermediate.dense_act.dense.weight', 'bert.encoder.layer.10.intermediate.dense_act.dense.bias', 'bert.encoder.layer.11.PreAttentionLayerNorm.weight', 'bert.encoder.layer.11.PreAttentionLayerNorm.bias', 'bert.encoder.layer.11.PostAttentionLayerNorm.weight', 'bert.encoder.layer.11.PostAttentionLayerNorm.bias', 'bert.encoder.layer.11.intermediate.dense_act.dense.weight', 'bert.encoder.layer.11.intermediate.dense_act.dense.bias', 'cls.predictions.transform.dense_act.dense.weight', 'cls.predictions.transform.dense_act.dense.bias'] - This IS expected if you are initializing BertLMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing BertLMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

    I'm not familiar with deepspeed and would prefer not to use it in my downstream analysis. Is there a straightforward way to load the finetune checkpoint as a standard PyTorch model?

    Thank you for your help!

    opened by QuintinPope 1
  • Which versions for pre-training?

    First of all: thanks for your work! I am trying to run the pre-training script, but I keep having compatibility issues while installing dependencies. I would be interested to know which of these you used:

    1. Nvidia driver version
    2. Pytorch version + its CUDA version
    3. CUDA version
    4. CuDNN version
    5. Python version
    6. OS
    opened by marcelbra 1
  • Fixed run_glue script

    Fixes #10, the run_glue script. The problem was not that the functions were not imported from the local dataset directory, as assumed in #9.

    The problem was that the datasets package was not listed in the requirements.txt

    I was able to run the script using this setup.

    opened by Rotendahl 1
  • Fix: Finetune checkpoints are not created

    Finetune checkpoints are not created because the code calls save_weights_ckpt, while the method in BasePretrainModel is named save_weights. This PR fixes the issue.

    opened by peerdavid 1
  • The file produced by process_data.py is empty

    Thanks for your awesome work and detailed README! However, when I perform preprocessing with process_data.py, the output directory and file wiki_one_article_per_line.txt is empty. I think the input file of process_data.py is in the right format, like what is mentioned in wikiextractor:

    <doc id="" url="" title="">
        ...
    </doc>

    I'm looking forward to your early reply :)

    opened by Richar-Du 0
  • the eval_acc on RTE dataset is only 55%

    Hello, thank you for your code. I tried to run your code with the following command:

    aim=pretraining_experiment-bert-mlm--23000
    deepspeed --include=localhost:0,1,2,3,4,5,6,7 --master_port 64000 run_pretraining.py \
      --model_type bert-mlm --tokenizer_name bert-base-uncased \
      --hidden_act gelu \
      --hidden_size 1024 \
      --num_hidden_layers 24 \
      --num_attention_heads 16 \
      --intermediate_size 4096 \
      --hidden_dropout_prob 0.1 \
      --attention_probs_dropout_prob 0.1 \
      --encoder_ln_mode pre-ln \
      --lr 1e-3 \
      --train_batch_size 4096 \
      --train_micro_batch_size_per_gpu 128 \
      --lr_schedule step \
      --curve linear \
      --warmup_proportion 0.06 \
      --gradient_clipping 0.0 \
      --optimizer_type adamw \
      --weight_decay 0.01 \
      --adam_beta1 0.9 \
      --adam_beta2 0.98 \
      --adam_eps 1e-6 \
      --total_training_time 24.0 \
      --early_exit_time_marker 24.0 \
      --dataset_path path_to_dataset \
      --output_dir path_to_output \
      --print_steps 100 \
      --num_epochs_between_checkpoints 10000 \
      --job_name ${aim} \
      --project_name budget-bert-pretraining \
      --validation_epochs 3 \
      --validation_epochs_begin 1 \
      --validation_epochs_end 1 \
      --validation_begin_proportion 0.05 \
      --validation_end_proportion 0.01 \
      --validation_micro_batch 16 \
      --deepspeed \
      --data_loader_type dist \
      --do_validation \
      --use_early_stopping \
      --early_stop_time 180 \
      --early_stop_eval_loss 6 \
      --seed 42 \
      --fp16 \
      --max_steps 23000 \
      --finetune_checkpoint_at_end

    I did not change your code, but the eval_acc on RTE is only 55%, which is significantly lower than the BERT baseline (~65%). Could you give some advice?

    opened by leoozy 1
  • The training process will get stuck after training for one epoch

    Hi @peteriz, it seems there is an issue when deleting the line global_rank = 0. With different workers reading different shards, the total number of iterations per worker in an epoch differs, so at the end of an epoch there is a synchronization issue and training gets stuck. With global_rank = 0 kept, the issue disappears, since the torch data sampler gives each worker the same amount of data. But this has the problem @sangmichaelxie described: it reads only every 8th file.

    Originally posted by @Xinpeng-Wang in https://github.com/IntelLabs/academic-budget-bert/issues/22#issuecomment-1173159490

    As @Xinpeng-Wang said, after you fix the bug by deleting global_rank = 0, the code gets stuck after one epoch. Could you please help me solve the problem?

    opened by leoozy 10
  • Finetuning commands for other glue tasks

    Hi, can you share the finetuning commands you used for the other GLUE tasks? Did you use the same warmup, hyperparameters, etc. as in the example MRPC command you shared?

    opened by raghavlite 1
  • What is the size of the processed data?

    Hello, I processed the Wikipedia and Bookcorpus datasets using your scripts. The total size of the processed Wikipedia dataset is around 106G (~2650 hdf5 files). Could you please tell me whether this is correct?

    opened by leoozy 1
  • Distributed pretraining dataset question

    https://github.com/IntelLabs/academic-budget-bert/blob/ea000838156e3be251699ad6a3c8b1339c76e987/pretraining/dataset/distributed_pretraining_dataset.py#L280

    In the above line, the global_rank is set to 0 for all workers, meaning that the function will return the same file_index for all the workers. If world_size = 8, then it seems like this code is reading every 8th file and skipping the files in between. Can you explain why this is done? Thanks.

    opened by sangmichaelxie 3