Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".

Overview

VL-BERT

By Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai.

This repository is an official implementation of the paper VL-BERT: Pre-training of Generic Visual-Linguistic Representations.

Update on 2020/01/16: Added visualization code.

Update on 2019/12/20: VL-BERT was accepted by ICLR 2020.

Introduction

VL-BERT is a simple yet powerful pre-trainable generic representation for visual-linguistic tasks. It is pre-trained on a massive-scale caption dataset and a text-only corpus, and can be fine-tuned for various downstream visual-linguistic tasks, such as Visual Commonsense Reasoning, Visual Question Answering, and Referring Expression Comprehension.

Thanks to PyTorch and its third-party libraries, this codebase also provides the following features:

  • Distributed Training
  • FP16 Mixed-Precision Training
  • Various Optimizers and Learning Rate Schedulers
  • Gradient Accumulation
  • Monitoring the Training Using TensorboardX

Citing VL-BERT

@inproceedings{
  Su2020VL-BERT:,
  title={VL-BERT: Pre-training of Generic Visual-Linguistic Representations},
  author={Weijie Su and Xizhou Zhu and Yue Cao and Bin Li and Lewei Lu and Furu Wei and Jifeng Dai},
  booktitle={International Conference on Learning Representations},
  year={2020},
  url={https://openreview.net/forum?id=SygXPaEYvH}
}

Prepare

Environment

  • Ubuntu 16.04, CUDA 9.0, GCC 4.9.4
  • Python 3.6.x
    # We recommend using Anaconda/Miniconda to create a conda environment
    conda create -n vl-bert python=3.6 pip
    conda activate vl-bert
  • PyTorch 1.0.0 or 1.1.0
    conda install pytorch=1.1.0 cudatoolkit=9.0 -c pytorch
  • Apex (optional, for speed-up and fp16 training)
    git clone https://github.com/jackroos/apex
    cd ./apex
    pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./  
  • Other requirements:
    pip install Cython
    pip install -r requirements.txt
  • Compile
    ./scripts/init.sh
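
Since the compile step builds CUDA extensions, one of the issues reported further down attributes an "undefined symbol" import error to a mismatch between the system CUDA toolkit and the CUDA version used in the environment. The following is a minimal sketch for comparing the two before compiling. It is not part of this repository; the function names are hypothetical, and it assumes `nvcc` is on PATH and prints its usual "release X.Y" line.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' CUDA release found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def system_cuda_version():
    """Ask the nvcc on PATH for its release; None if nvcc is missing or fails."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


def pytorch_cuda_version():
    """CUDA version PyTorch was built with; None if PyTorch is absent or CPU-only."""
    try:
        import torch
    except Exception:  # a missing or broken install is treated the same way here
        return None
    return torch.version.cuda


def same_major_minor(version_a, version_b):
    """True when both versions are known and agree on major.minor (e.g. 9.0 vs 9.0.176)."""
    if version_a is None or version_b is None:
        return False
    return version_a.split(".")[:2] == version_b.split(".")[:2]
```

Calling `same_major_minor(system_cuda_version(), pytorch_cuda_version())` before `./scripts/init.sh` gives a quick hint; a `False` result only signals that something is worth checking, not that the build will necessarily fail.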

Data

See PREPARE_DATA.md.

Pre-trained Models

See PREPARE_PRETRAINED_MODELS.md.

Training

Distributed Training on Single-Machine

./scripts/dist_run_single.sh <num_gpus> <task>/train_end2end.py <path_to_cfg> <dir_to_store_checkpoint>
  • <num_gpus>: number of gpus to use.
  • <task>: pretrain/vcr/vqa/refcoco.
  • <path_to_cfg>: config yaml file under ./cfgs/<task>.
  • <dir_to_store_checkpoint>: root directory to store checkpoints.

The following is a more concrete example:

./scripts/dist_run_single.sh 4 vcr/train_end2end.py ./cfgs/vcr/base_q2a_4x16G_fp32.yaml ./

Distributed Training on Multi-Machine

For example, on 2 machines (A and B), each with 4 GPUs,

run the following command on machine A:

./scripts/dist_run_multi.sh 2 0 <ip_addr_of_A> 4 <task>/train_end2end.py <path_to_cfg> <dir_to_store_checkpoint>

run the following command on machine B:

./scripts/dist_run_multi.sh 2 1 <ip_addr_of_A> 4 <task>/train_end2end.py <path_to_cfg> <dir_to_store_checkpoint>
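
The two commands above differ only in their second argument. As a sketch for larger clusters, the helper below generates one command per machine. The meaning of the positional arguments (number of machines, machine rank, IP address of machine A, GPUs per machine) is inferred from the example and is an assumption, and `multi_machine_commands` is a hypothetical name, not something the repository provides.

```python
def multi_machine_commands(num_machines, master_ip, gpus_per_machine,
                           task, cfg_path, ckpt_dir):
    """Return one launch command per machine, ordered by machine rank.

    Assumed argument order of dist_run_multi.sh (inferred from the README example):
    <num_machines> <machine_rank> <ip_addr_of_A> <gpus_per_machine> <script> <cfg> <ckpt_dir>
    """
    script = "{}/train_end2end.py".format(task)
    commands = []
    for rank in range(num_machines):
        parts = [
            "./scripts/dist_run_multi.sh",
            str(num_machines),
            str(rank),
            master_ip,
            str(gpus_per_machine),
            script,
            cfg_path,
            ckpt_dir,
        ]
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    # Hypothetical IP address and checkpoint directory, for illustration only.
    for cmd in multi_machine_commands(
            2, "10.0.0.1", 4, "vcr",
            "./cfgs/vcr/base_q2a_4x16G_fp32.yaml", "./ckpt"):
        print(cmd)
```

Run the command with rank 0 on machine A and each remaining command on its own machine.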

Non-Distributed Training

./scripts/nondist_run.sh <task>/train_end2end.py <path_to_cfg> <dir_to_store_checkpoint>

Note:

  1. The yaml files under ./cfgs set the batch size for GPUs with at least 16G of memory. You may need to adapt the batch size and gradient accumulation steps to your setup: for example, if you decrease the batch size, increase the gradient accumulation steps accordingly so that the 'actual' batch size for SGD stays unchanged.

  2. For efficiency, we recommend using distributed training even on a single machine. For RefCOCO+, however, distributed training may deadlock for an unknown reason (possibly related to a PyTorch dataloader deadlock); in that case, simply use non-distributed training.
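
The first note can be made concrete with a little arithmetic. The sketch below assumes the configured batch size is per GPU, so that the 'actual' batch size equals per-GPU batch size × number of GPUs × gradient accumulation steps; that reading, the function names, and the numbers in the example are assumptions for illustration rather than values taken from the config files.

```python
def effective_batch_size(per_gpu_batch, num_gpus, accumulate_steps):
    """Number of samples contributing to one optimizer step."""
    return per_gpu_batch * num_gpus * accumulate_steps


def accumulate_steps_for(target_effective, per_gpu_batch, num_gpus):
    """Accumulation steps needed to reach target_effective exactly.

    Raises ValueError when the target cannot be hit with a whole number of steps.
    """
    per_step = per_gpu_batch * num_gpus
    if per_step <= 0 or target_effective % per_step != 0:
        raise ValueError(
            "target {} is not a multiple of {} samples per step".format(
                target_effective, per_step))
    return target_effective // per_step


if __name__ == "__main__":
    # Hypothetical reference setup: 4 GPUs x 16 samples, no accumulation.
    target = effective_batch_size(per_gpu_batch=16, num_gpus=4, accumulate_steps=1)
    # Hypothetical smaller setup: 2 GPUs that only fit 8 samples each.
    steps = accumulate_steps_for(target, per_gpu_batch=8, num_gpus=2)
    print(target, steps)
```

With these example numbers, halving both the GPU count and the per-GPU batch size calls for four accumulation steps to keep the same 'actual' batch size.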

Evaluation

VCR

  • Local evaluation on val set:

    python vcr/val.py \
      --a-cfg <cfg_of_q2a> --r-cfg <cfg_of_qa2r> \
      --a-ckpt <checkpoint_of_q2a> --r-ckpt <checkpoint_of_qa2r> \
      --gpus <indexes_of_gpus_to_use> \
      --result-path <dir_to_save_result> --result-name <result_file_name>
    

    Note: <indexes_of_gpus_to_use> is a space-separated list of GPU indexes, e.g., 0 1 2 3.

  • Generate prediction results on test set for leaderboard submission:

    python vcr/test.py \
      --a-cfg <cfg_of_q2a> --r-cfg <cfg_of_qa2r> \
      --a-ckpt <checkpoint_of_q2a> --r-ckpt <checkpoint_of_qa2r> \
      --gpus <indexes_of_gpus_to_use> \
      --result-path <dir_to_save_result> --result-name <result_file_name>
    

VQA

  • Generate prediction results on test set for EvalAI submission:
    python vqa/test.py \
      --cfg <cfg_file> \
      --ckpt <checkpoint> \
      --gpus <indexes_of_gpus_to_use> \
      --result-path <dir_to_save_result> --result-name <result_file_name>
    

RefCOCO+

  • Local evaluation on val/testA/testB set:
    python refcoco/test.py \
      --split <val|testA|testB> \
      --cfg <cfg_file> \
      --ckpt <checkpoint> \
      --gpus <indexes_of_gpus_to_use> \
      --result-path <dir_to_save_result> --result-name <result_file_name>
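
Referring expression comprehension is usually scored by comparing each predicted box with the ground-truth box via intersection over union (IoU), and one of the issues further down discusses the IoU threshold used in refcoco/function/test.py. The sketch below shows the general computation only. It is not the repository's evaluation code, the function names are hypothetical, and the default threshold of 0.5 is a common convention assumed here, not necessarily the value used in this codebase.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_iou(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of predictions whose IoU with the matching ground truth reaches threshold."""
    if len(pred_boxes) != len(gt_boxes) or not gt_boxes:
        raise ValueError("need equally sized, non-empty box lists")
    hits = sum(1 for pred, gt in zip(pred_boxes, gt_boxes)
               if box_iou(pred, gt) >= threshold)
    return hits / len(gt_boxes)
```

Lowering `threshold` makes the metric more permissive, which is why the choice of threshold matters when comparing numbers across papers.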
    

Visualization

See VISUALIZATION.md.

Acknowledgements

Many thanks to the following codebases, which helped us a lot in building this one:

Comments
  • i have 'ImportError: /home/ailab/VL-BERT/vcr/../common/lib/roi_pooling/C_ROIPooling.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration' problem!

    CUDA 9.0, PyTorch 1.1.0, Ubuntu 16.04.

    I hit this problem when running the command "./scripts/nondist_run.sh vcr/train_end2end.py ./cfgs/vcr/base_q2a_4x16G_fp32.yaml ./". From what I found, it is caused by a mismatch between the CUDA version installed on the system and the CUDA version active in my environment. But both of them are 9.0. What should I do? Help T^T

    Traceback (most recent call last):
      File "vcr/train_end2end.py", line 8, in <module>
        from vcr.function.train import train_net
      File "/home/ailab/VL-BERT/vcr/../vcr/function/train.py", line 26, in <module>
        from vcr.modules import *
      File "/home/ailab/VL-BERT/vcr/../vcr/modules/__init__.py", line 1, in <module>
        from .resnet_vlbert_for_vcr import ResNetVLBERT
      File "/home/ailab/VL-BERT/vcr/../vcr/modules/resnet_vlbert_for_vcr.py", line 7, in <module>
        from common.fast_rcnn import FastRCNN
      File "/home/ailab/VL-BERT/vcr/../common/fast_rcnn.py", line 10, in <module>
        from common.lib.roi_pooling.roi_pool import ROIPool
      File "/home/ailab/VL-BERT/vcr/../common/lib/roi_pooling/__init__.py", line 1, in <module>
        from .roi_align import ROIAlign
      File "/home/ailab/VL-BERT/vcr/../common/lib/roi_pooling/roi_align.py", line 8, in <module>
        from . import C_ROIPooling
    ImportError: /home/ailab/VL-BERT/vcr/../common/lib/roi_pooling/C_ROIPooling.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration
    
    opened by jaeyun95 12
  • undefined symbol:  _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

    when running the command "./scripts/dist_run_single.sh 4 vcr/train_end2end.py ./cfgs/vcr/base_q2a_4x16G_fp32.yaml ./"

    I got this error

    ImportError: /data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/C_ROIPooling.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E
    Traceback (most recent call last):
      File "./scripts/launch.py", line 200, in <module>
        main()
      File "./scripts/launch.py", line 196, in main
        cmd=process.args)
    subprocess.CalledProcessError: Command '['/root/anaconda3/envs/vl-bert/bin/python', '-u', 'vcr/train_end2end.py', '--cfg', './cfgs/vcr/base_q2a_4x16G_fp32.yaml', '--model-dir', './', '--dist']' returned non-zero exit status 1.
    (vl-bert) root@6cbe3808c2b8:/data/workspace/VL-BERT# Traceback (most recent call last):
      File "vcr/train_end2end.py", line 8, in <module>
        from vcr.function.train import train_net
      File "/data/workspace/VL-BERT/vcr/../vcr/function/train.py", line 26, in <module>
        from vcr.modules import *
      File "/data/workspace/VL-BERT/vcr/../vcr/modules/__init__.py", line 1, in <module>
        from .resnet_vlbert_for_vcr import ResNetVLBERT
      File "/data/workspace/VL-BERT/vcr/../vcr/modules/resnet_vlbert_for_vcr.py", line 7, in <module>
        from common.fast_rcnn import FastRCNN
      File "/data/workspace/VL-BERT/vcr/../common/fast_rcnn.py", line 10, in <module>
        from common.lib.roi_pooling.roi_pool import ROIPool
      File "/data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/__init__.py", line 1, in <module>
        from .roi_align import ROIAlign
      File "/data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/roi_align.py", line 8, in <module>
        from . import C_ROIPooling
    ImportError: /data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/C_ROIPooling.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E
    Traceback (most recent call last):
      File "vcr/train_end2end.py", line 8, in <module>
        from vcr.function.train import train_net
      File "/data/workspace/VL-BERT/vcr/../vcr/function/train.py", line 26, in <module>
        from vcr.modules import *
      File "/data/workspace/VL-BERT/vcr/../vcr/modules/__init__.py", line 1, in <module>
        from .resnet_vlbert_for_vcr import ResNetVLBERT
      File "/data/workspace/VL-BERT/vcr/../vcr/modules/resnet_vlbert_for_vcr.py", line 7, in <module>
        from common.fast_rcnn import FastRCNN
      File "/data/workspace/VL-BERT/vcr/../common/fast_rcnn.py", line 10, in <module>
        from common.lib.roi_pooling.roi_pool import ROIPool
      File "/data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/__init__.py", line 1, in <module>
        from .roi_align import ROIAlign
      File "/data/workspace/VL-BERT/vcr/../common/lib/roi_pooling/roi_align.py", line 8, in <module>
        from . import C_ROIPooling


    I think it may be caused by ROI_pooling.xx.so

    But I've compiled C_ROIPooling.cpython-36m-x86_64-linux-gnu.so with "./scripts/init.sh"

    opened by search-opensource-space 12
  • attention_viz

    Dear contributor,

    Fantastic work on VL task!

    I am having some trouble producing the visualization results between language and ROIs. Could you please share the code that generates results like the one shown in attention_viz.png, if possible?

    Many thanks.

    opened by detectiveli 11
  • Which config file should I use when doing pre-training and fine-tuning on each task to reproduce the paper results?

    Hi. I have noticed that there are several config files in cfg/pretrain, cfg/vqa and cfg/refcoco (like there are 3 base-model configs base_e2e_16x16G_fp16.yaml, base_prec_4x16G_fp32.yaml, base_prec_withouttextonly_4x16G_fp32.yaml existing in cfg/pretrain) Can you provide more details about the differences of these configs? If I want to reproduce the paper results, which configs among them should I use? Thank you!

    opened by yangapku 9
  • _pickle.UnpicklingError: invalid load key, '-'.

    Thanks for your great code. I am trying to train on the VCR task to see the result. When I ran

    python vcr/val.py \
      --a-cfg ./cfgs/vcr/base_q2a_4x16G_fp32.yaml --r-cfg ./cfgs/vcr/base_qa2r_4x16G_fp32.yaml \
      --a-ckpt ./output/base_q2a_4x16G_fp32.yaml --r-ckpt ./output/base_qa2r_4x16G_fp32.yaml \
      --gpus 0 1 \
      --result-path ./results/ --result-name eval_vcr

    the following error occurred:

    warnings.warn('miss keys: {}'.format(miss_keys))
    Warnings: Unexpected keys: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.gamma', 'cls.predictions.transform.LayerNorm.beta', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias'].
    Traceback (most recent call last):
      File "vcr/val.py", line 214, in <module>
        main()
      File "/home/songzijie/.conda/envs/vlbert/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 43, in decorate_no_grad
        return func(*args, **kwargs)
      File "vcr/val.py", line 114, in main
        a_ckpt = torch.load(args.a_ckpt, map_location=lambda storage, loc: storage)
      File "/home/songzijie/.conda/envs/vlbert/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
        return _load(f, map_location, pickle_module, **pickle_load_args)
      File "/home/songzijie/.conda/envs/vlbert/lib/python3.6/site-packages/torch/serialization.py", line 564, in _load
        magic_number = pickle_module.load(f, **pickle_load_args)
    _pickle.UnpicklingError: invalid load key, '-'.

    I hope to get some help solving this problem. Thanks a lot.
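
The load key reported in the error above is the first byte of the file handed to torch.load, and the paths given to --a-ckpt and --r-ckpt in that command end in .yaml, so one possible cause is that a text file was passed where a checkpoint was expected. The sketch below is a hypothetical diagnostic, not part of this repository: it inspects the first bytes of a file, relying on the facts that pickle streams written with protocol 2 or later start with the byte 0x80 and that zip archives start with "PK".

```python
def describe_checkpoint_file(path):
    """Guess what kind of file `path` is from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if not head:
        return "empty file"
    if head.startswith(b"PK"):
        return "zip archive (format used by newer torch.save versions)"
    if head.startswith(b"\x80"):
        return "pickle stream (format used by older torch.save versions)"
    return "not a pickle or zip; first bytes are {!r}".format(head)


if __name__ == "__main__":
    import sys

    # Usage: python check_ckpt.py <path> [<path> ...]
    for checkpoint_path in sys.argv[1:]:
        print(checkpoint_path, "->", describe_checkpoint_file(checkpoint_path))
```

A result starting with "not a pickle or zip" for a supposed checkpoint suggests the path points at something else, such as a config file.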

    opened by liulijie-2020 7
  • share logs of pretrain

    Excellent work! I used the script and https://github.com/jackroos/VL-BERT/blob/master/cfgs/pretrain/base_e2e_16x16G_fp16.yaml to pretrain, but I can't be sure whether the loss and accuracy are good enough. When I fine-tune my own pre-trained model on the VCR task, I cannot reach the high score obtained with the shared pre-trained model. I would appreciate it if you could share the pre-training logs. Thanks.

    opened by itsliupeng 7
  • Unrecognized tensor type ID: AutogradCUDA

    When I ran the following script for fine-tuning on RefCOCO:

    ! bash ./scripts/nondist_run.sh refcoco/train_end2end.py 'cfgs/refcoco/base_gt_boxes_4x16G.yaml' refcoco_base_gt_ckpt

    I encountered the following error.

    [Partial Load] non matched keys: ['object_mask_word_embedding.weight', 'aux_text_visual_embedding.weight', 'vlbert.mlm_head.predictions.bias', 'vlbert.mlm_head.predictions.transform.dense.weight', 'vlbert.mlm_head.predictions.transform.dense.bias', 'vlbert.mlm_head.predictions.transform.LayerNorm.weight', 'vlbert.mlm_head.predictions.transform.LayerNorm.bias', 'vlbert.mlm_head.predictions.decoder.weight', 'vlbert.mvrc_head.region_cls_pred.weight', 'vlbert.mvrc_head.region_cls_pred.bias']
    [Partial Load] non pretrain keys: ['final_mlp.2.weight', 'final_mlp.2.bias']
    PROGRESS: 0.00%
    /content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/fast_rcnn.py:136: UserWarning: This overload of nonzero is deprecated: nonzero() Consider using one of the following signatures instead: nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)
      box_inds = box_mask.nonzero()
    Traceback (most recent call last):
      File "refcoco/train_end2end.py", line 60, in <module>
        main()
      File "refcoco/train_end2end.py", line 54, in main
        rank, model = train_net(args, config)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../refcoco/function/train.py", line 323, in train_net
        gradient_accumulate_steps=config.TRAIN.GRAD_ACCUMULATE_STEPS)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/trainer.py", line 115, in train
        outputs, loss = net(*batch)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/module.py", line 22, in forward
        return self.train_forward(*inputs, **kwargs)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../refcoco/modules/resnet_vlbert_for_refcoco.py", line 96, in train_forward
        segms=None)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/fast_rcnn.py", line 149, in forward
        roi_align_res = self.roi_align(img_feats['body4'], rois).type(images.dtype)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/lib/roi_pooling/roi_align.py", line 69, in forward
        input.float(), rois.float(), self.output_size, self.spatial_scale, self.sampling_ratio
      File "/content/gdrive/My Drive/DDP/VL-BERT/refcoco/../common/lib/roi_pooling/roi_align.py", line 20, in forward
        input, rois, spatial_scale, output_size[0], output_size[1], sampling_ratio
    RuntimeError: Unrecognized tensor type ID: AutogradCUDA

    I am running on Google Colab with PyTorch 1.7.0, torchvision 0.8.1, and CUDA 10.1. The same error occurs with CUDA 9.2. When I use PyTorch 1.1.0 as mentioned in the README, I get many errors related to torchvision modules. Please help; I urgently need this to complete my project.

    opened by jkkishore1999 6
  • Some little errors in preparation of conceptual-captions dataset

    When I tried to extract image features from Conceptual Captions following the instructions, I found some errors during inference with this file. It took me some time to debug, so I'd like to share the fixes:

    • Firstly, the calling script should be:
    python ./tools/generate_tsv_v2.py --gpu 0,1,2,3,4,5,6,7 --cfg experiments/cfgs/faster_rcnn_end2end_resnet.yml --def models/vg/ResNet-101/faster_rcnn_end2end_final/test.prototxt --net data/faster_rcnn_models/resnet101_faster_rcnn_final.caffemodel --split conceptual_captions_train --data_root {Conceptual_Captions_Root} --out {Conceptual_Captions_Root}/train_frcnn/
    

    There was a small mistake in the --def parameter.

    • Secondly, in Line46 and Line54, should be:
    with open(os.path.join(data_root, 'utils/train.json')) as f:  #Line46
    with open(os.path.join(data_root, 'utils/val.json')) as f:  #Line54
    
    • Thirdly, in Line67, should be:
    zip_image = ziphelper.imread(str("/".join([data_root, im_file])))
    
    • Fourthly, in Line142, we need another parameter:
    def generate_tsv(gpu_id, prototxt, weights, image_ids, data_root, outfolder):
    

    And correspondingly, in Line170, the same problem:

    json.dump(get_detections_from_im(net, im_file, image_id, ziphelper, data_root), f)
    

    Really thank you for your sharing of the code!

    opened by weiyx16 6
  • Is the refcoco-trained model available?

    Hello,

    It seems that only generic pre-trained models are available, and we have to fine-tune the model on the specific task, right? I am wondering if the refcoco-trained model is available for evaluation?

    Thanks, st2yang

    opened by st2yang 5
  • Why cannot import name 'C_ROIPooling' from 'common.lib.roi_pooling'

    I ran into this problem: ImportError: cannot import name 'C_ROIPooling' from 'common.lib.roi_pooling' (/home/wenhao/project/VL-BERT-master/pretrain/../common/lib/roi_pooling/__init__.py). I don't know how to solve it. Please help.

    opened by wenhao7841 5
  • Could only download 400k images

    I am trying to use your script to download conceptual captions dataset. I was able to download only 400k images from the training set, instead of the 3M images in the training set. I have run your script 5 times to download the images which might be coming from unreliable servers. Apparently, there are a lot of images for which the links do not seem to work anymore. If you have the images downloaded somewhere, would it be possible for you to share the dataset?

    opened by gsrivas4 5
  • Bump protobuf from 3.10.0 to 3.18.3

    Bumps protobuf from 3.10.0 to 3.18.3.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.18.3

    C++

    Protocol Buffers v3.16.1

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.2

    Java

    • Improve performance characteristics of UnknownFieldSet parsing (#9371)

    Protocol Buffers v3.18.1

    Python

    • Update setup.py to reflect that we now require at least Python 3.5 (#8989)
    • Performance fix for DynamicMessage: force GetRaw() to be inlined (#9023)

    Ruby

    • Update ruby_generator.cc to allow proto2 imports in proto3 (#9003)

    Protocol Buffers v3.18.0

    C++

    • Fix warnings raised by clang 11 (#8664)
    • Make StringPiece constructible from std::string_view (#8707)
    • Add missing capability attributes for LLVM 12 (#8714)
    • Stop using std::iterator (deprecated in C++17). (#8741)
    • Move field_access_listener from libprotobuf-lite to libprotobuf (#8775)
    • Fix #7047 Safely handle setlocale (#8735)
    • Remove deprecated version of SetTotalBytesLimit() (#8794)
    • Support arena allocation of google::protobuf::AnyMetadata (#8758)
    • Fix undefined symbol error around SharedCtor() (#8827)
    • Fix default value of enum(int) in json_util with proto2 (#8835)
    • Better Smaller ByteSizeLong
    • Introduce event filters for inject_field_listener_events
    • Reduce memory usage of DescriptorPool
    • For lazy fields copy serialized form when allowed.
    • Re-introduce the InlinedStringField class
    • v2 access listener
    • Reduce padding in the proto's ExtensionRegistry map.
    • GetExtension performance optimizations
    • Make tracker a static variable rather than call static functions
    • Support extensions in field access listener
    • Annotate MergeFrom for field access listener
    • Fix incomplete types for field access listener
    • Add map_entry/new_map_entry to SpecificField in MessageDifferencer. They record the map items which are different in MessageDifferencer's reporter.
    • Reduce binary size due to fieldless proto messages
    • TextFormat: ParseInfoTree supports getting field end location in addition to start.

    ... (truncated)


    dependencies 
    opened by dependabot[bot] 0
  • script for downloading GCC images

    Hi, Thank you for providing a script for downloading the GCC images. I am using your script for downloading a large set of images similar to GCC. However, the script freezes every 30min-1hr and I have to manually stop the process and run the script again. Because of the freezing it took 2 days to process only 300k urls out of a list of 1.6M so far. Is there any fix to avoid freezing? Thank you for any help in advance.
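
One general way to keep a bulk download from stalling on unresponsive servers is to give every request a socket timeout and to record failures instead of waiting on them. The sketch below illustrates that idea with the Python standard library only. It is not the repository's download script, and `download_one`, `download_all`, and their default values are hypothetical. Note that the `timeout` of `urllib.request.urlopen` bounds each blocking socket operation, not the total transfer time.

```python
import concurrent.futures
import shutil
import urllib.request


def download_one(url, dest_path, timeout=10.0):
    """Fetch one URL into dest_path; return (url, error), with error None on success."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response, \
                open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
        return url, None
    except Exception as exc:  # broad on purpose: one bad URL must not stop the batch
        return url, repr(exc)


def download_all(jobs, max_workers=8, timeout=10.0):
    """Download (url, dest_path) pairs concurrently; return a list of (url, error) failures."""
    failures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(download_one, url, dest, timeout)
                   for url, dest in jobs]
        for future in concurrent.futures.as_completed(futures):
            url, error = future.result()
            if error is not None:
                failures.append((url, error))
    return failures
```

The returned failure list can be fed back into `download_all` for a retry pass, so dead links are skipped quickly rather than blocking the whole run.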

    opened by sesmae 0
  • google drive

    Hello jackroos,

    1. I want to know whether the Google Drive files can be downloaded with the gdown command. I tried "gdown https://drive.google.com/u/0/uc?export=download" for the BERT model file, but it does not work.

    2. wget -c ...... does not work either. I also tried a command suggested by others:

    wget --load-cookies /tmp/cookies.txt "https://drive.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://drive.google.com/uc?export=download&id=${fileid}' -O- | sed -rn 's/.confirm=([0-9A-Za-z_]+)./\1\n/p')&id=${fileid}" -O ${filename} && rm -rf /tmp/cookies.txt

    I

    opened by zhangdabusy 0
  • Improper RefCOCO evaluation

    When comparing with MAttNet in Table 3, the target classification accuracy should be used. When computing IoU to approximate the classification accuracy, a small threshold (https://github.com/jackroos/VL-BERT/blob/master/refcoco/function/test.py#L17) was used, which could count false positives as correct.

    opened by st2yang 0
Owner
Weijie Su
Graduate student at USTC.