I ran into this error when running ./scripts/zest_grouped_bart_large_from_trained.sh:
Epoch 0: 0% 0/538 [00:00<?, ?it/s]
Traceback (most recent call last):
File "cli_grouped.py", line 142, in
main()
File "cli_grouped.py", line 139, in main
run(args, logger)
File "/content/drive/MyDrive/hypter/run_grouped.py", line 87, in run
train(args, logger, model, train_data, dev_data, optimizer, scheduler)
File "/content/drive/MyDrive/hypter/run_grouped.py", line 154, in train
is_training=True)
File "/content/drive/MyDrive/hypter/growing_bart.py", line 119, in forward
is_training=is_training
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/drive/MyDrive/hypter/bart_with_adapter.py", line 298, in forward
use_cache=use_cache,
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_bart.py", line 835, in forward
encoder_outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_bart.py", line 309, in forward
x, attn = encoder_layer(x, attention_mask)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/content/drive/MyDrive/hypter/bart_with_adapter.py", line 138, in forward
query=x, key=x, key_padding_mask=encoder_padding_mask, need_weights=self.output_attentions
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_bart.py", line 646, in forward
attn_weights = attn_weights.masked_fill(reshaped, float("-inf"))
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 14.76 GiB total capacity; 13.24 GiB already allocated; 81.75 MiB free; 13.29 GiB reserved in total by PyTorch)
Is the model simply too large for a forward pass on the GPU available in Colab?