First, I want to thank the authors for this great work! I might find it useful for my research.
I encountered 3 problems:
- In `evaluator/dataset_evaluator.py`, the call to `hf_hub_download` raised an exception because it was written as `hf_hub_download(repo_id="datasets/tau/scrolls", filename="metrics/scrolls.py")` instead of `hf_hub_download(repo_id="tau/scrolls", filename="metrics/scrolls.py", repo_type="dataset")`. I don't know why it worked for you; perhaps there was a breaking change in the datasets library recently. Would you like me to open a PR for that?
- The generate script (`python scripts/execute.py scripts/commands/generate.py {dataset}_{model}_{split} --checkpoint_path path/to/model/folder`) took a very long time, much longer than fine-tuning 256-bart. There was a warning that might be related:
Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector
Edit: I now noticed that this warning is emitted only when I use more than one GPU. However, it is still slower than expected.
- The generate script also failed with the following exception:
Traceback (most recent call last):
  File "/home/liranringel/scrolls/baselines/scripts/execute.py", line 53, in <module>
    main(command_dict, unknown)
  File "/home/liranringel/scrolls/baselines/scripts/execute.py", line 33, in main
    runpy.run_module(module_name, run_name="__main__")
  File "/home/liranringel/miniconda3/envs/mem/lib/python3.9/runpy.py", line 228, in run_module
    return _run_code(code, {}, init_globals, run_name, mod_spec)
  File "/home/liranringel/miniconda3/envs/mem/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/liranringel/scrolls/baselines/src/run.py", line 789, in <module>
    main()
  File "/home/liranringel/scrolls/baselines/src/run.py", line 656, in main
    metrics = trainer.evaluate(metric_key_prefix="eval")
  File "/home/liranringel/miniconda3/envs/mem/lib/python3.9/site-packages/transformers/trainer_seq2seq.py", line 131, in evaluate
    eval_preds = self._post_process_function(untokenized_eval_dataset, eval_loop_output.predictions)
  File "/home/liranringel/miniconda3/envs/mem/lib/python3.9/site-packages/transformers/trainer_seq2seq.py", line 326, in _post_process_function
    assert len(untokenized_eval_dataset) == len(self.eval_dataset)
AssertionError
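
For the first problem, here is a minimal sketch of the kind of change I have in mind. The helper names `split_legacy_repo_id` and `download_scrolls_metric` are hypothetical and do not come from the repository. The sketch assumes only that `hf_hub_download` accepts a `repo_type` argument, as in the corrected call above. The real fix in `evaluator/dataset_evaluator.py` may well be just the one-line change.

```python
from typing import Tuple

# Hypothetical mapping from legacy path-style prefixes to repo_type values.
LEGACY_PREFIXES = {"datasets/": "dataset", "spaces/": "space"}


def split_legacy_repo_id(legacy_repo_id: str) -> Tuple[str, str]:
    """Split a legacy id such as 'datasets/tau/scrolls' into (repo_id, repo_type)."""
    for prefix, repo_type in LEGACY_PREFIXES.items():
        if legacy_repo_id.startswith(prefix):
            return legacy_repo_id[len(prefix):], repo_type
    # No known prefix: assume a model repo, which is the hub's default type.
    return legacy_repo_id, "model"


def download_scrolls_metric(legacy_repo_id: str = "datasets/tau/scrolls") -> str:
    """Download metrics/scrolls.py and return its local path (needs network access)."""
    # Imported lazily so the helper above can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    repo_id, repo_type = split_legacy_repo_id(legacy_repo_id)
    return hf_hub_download(
        repo_id=repo_id,
        filename="metrics/scrolls.py",
        repo_type=repo_type,
    )
```

Calling `download_scrolls_metric()` would then be equivalent to the corrected call with `repo_id="tau/scrolls"` and `repo_type="dataset"`.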