Hello,
I ran the training script and got the output below. I followed the instructions exactly as described in the README.
(tvlnbert) rehmanm@kw60805:/data1/vln-bert$ python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 train.py --from_pretrained data/models/vilbert_pytorch_model_9.bin --save_name pre_train_run_id --num_epochs 50 --warmup_proportion 0.08 --cooldown_factor 8 --masked_language --masked_vision --no_ranking
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
02/15/2021 18:43:27 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/rehmanm/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
02/15/2021 18:43:27 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/rehmanm/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
02/15/2021 18:43:27 - INFO - main - using provided training trajectories
02/15/2021 18:44:49 - WARNING - utils.dataset.beam_dataset - skipping index: 603 in beam data in from path: data/beamsearch/beams_val_seen.json
02/15/2021 18:44:49 - WARNING - utils.dataset.beam_dataset - skipping index: 661 in beam data in from path: data/beamsearch/beams_val_seen.json
02/15/2021 18:44:49 - WARNING - utils.dataset.beam_dataset - skipping index: 1016 in beam data in from path: data/beamsearch/beams_val_seen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 32 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 76 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 121 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 268 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 285 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 298 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 301 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 306 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 361 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 384 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 439 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 487 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 537 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 547 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 791 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 813 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 909 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 918 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 929 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 974 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1128 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1204 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1245 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1281 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1312 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1391 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1430 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1587 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1651 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1769 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1775 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1857 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1860 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1906 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1928 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 1996 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2061 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2070 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2085 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2128 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2204 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2226 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2235 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2261 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - WARNING - utils.dataset.beam_dataset - skipping index: 2329 in beam data in from path: data/beamsearch/beams_val_unseen.json
02/15/2021 18:44:57 - INFO - main - batch_size: 4
02/15/2021 18:44:57 - INFO - vilbert.vilbert - loading archive file data/models/vilbert_pytorch_model_9.bin
02/15/2021 18:44:57 - INFO - vilbert.vilbert - Model config {
"attention_probs_dropout_prob": 0.1,
"bi_attention_type": 1,
"bi_hidden_size": 1024,
"bi_intermediate_size": 1024,
"bi_num_attention_heads": 8,
"fast_mode": false,
"fixed_t_layer": 0,
"fixed_v_layer": 0,
"fusion_method": "mul",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"in_batch_pairs": false,
"initializer_range": 0.02,
"intermediate_size": 3072,
"intra_gate": false,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pooling_method": "mul",
"predict_feature": false,
"t_biattention_id": [
6,
7,
8,
9,
10,
11
],
"type_vocab_size": 2,
"v_attention_probs_dropout_prob": 0.1,
"v_biattention_id": [
0,
1,
2,
3,
4,
5
],
"v_feature_size": 2048,
"v_hidden_act": "gelu",
"v_hidden_dropout_prob": 0.1,
"v_hidden_size": 1024,
"v_initializer_range": 0.02,
"v_intermediate_size": 1024,
"v_num_attention_heads": 8,
"v_num_hidden_layers": 6,
"v_target_size": 1601,
"vocab_size": 30522,
"with_coattention": true
}
02/15/2021 18:45:02 - INFO - vilbert.vilbert - Weights of VLNBert not initialized from pretrained model: ['bert.v_embeddings.image_orientation_embeddings.weight', 'bert.v_embeddings.image_orientation_embeddings.bias', 'bert.v_embeddings.image_next_orientation_embeddings.weight', 'bert.v_embeddings.image_next_orientation_embeddings.bias', 'bert.v_embeddings.image_sequence_embeddings.weight', 'vil_logit.weight', 'vil_logit.bias']
02/15/2021 18:45:02 - INFO - main - number of parameters: 250,086,014
02/15/2021 18:45:04 - INFO - main - using distributed data parallel
02/15/2021 18:45:04 - INFO - main - starting training...
/home/rehmanm/anaconda3/envs/tvlnbert/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:231: UserWarning: To get the last learning rate computed by the scheduler, please use get_last_lr().
warnings.warn("To get the last learning rate computed by the scheduler, "
/data1/vln-bert/vilbert/optimization.py:166: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at /opt/conda/conda-bld/pytorch_1595629411241/work/torch/csrc/utils/python_arg_parser.cpp:766.)
exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
/data1/vln-bert/vilbert/optimization.py:166: UserWarning: This overload of add_ is deprecated:
add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
add_(Tensor other, *, Number alpha) (Triggered internally at /opt/conda/conda-bld/pytorch_1595629411241/work/torch/csrc/utils/python_arg_parser.cpp:766.)
exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
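The run prints only warnings (skipped beam indices, get_last_lr(), and the deprecated add_ overload) and no traceback. I would be grateful if you could let me know whether this output is expected, or whether I need to change something to resolve these warnings.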
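In case it helps, this is how I read the add_ deprecation warning: line 166 of vilbert/optimization.py passes the scalar first, and the message suggests passing the tensor first with alpha as a keyword. Below is a minimal sketch of that newer form. The tensors and the beta1 value are placeholders I made up for illustration, not values from the repository, and I have not checked whether the maintainers would want this change.

```python
import torch

# Placeholder values for illustration only (not taken from the repository).
beta1 = 0.9
grad = torch.tensor([1.0, 2.0, 3.0])
exp_avg = torch.zeros(3)

# Deprecated form reported in the log:
#   exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
# Form suggested by the warning: tensor first, alpha as a keyword argument.
# Both compute exp_avg = beta1 * exp_avg + (1 - beta1) * grad in place.
exp_avg.mul_(beta1).add_(grad, alpha=1.0 - beta1)

print(exp_avg)
```

If this reading is right, the warning should be harmless for now, since the old overload is still accepted in the PyTorch version I have installed.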