🐛 Bug
Hi,
I was trying to evaluate the pre-trained models under "Efficient Wait-k Models for Simultaneous Machine Translation". For this, I followed the instructions given in the readme. Specifically, I did the following:
After downloading the model and data and placing them under `pre_saved`:
cd ~/attn2d/pre_saved
tar xzf iwslt14_de_en.tar.gz
tar xzf tf_waitk_model.tar.gz
k=5 # Evaluation time k
output=wait$k.log
CUDA_VISIBLE_DEVICES=0 python generate.py pre_saved/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pre_saved/tf_waitk_model.tar.gz --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output
It generates the following error message:
Traceback (most recent call last):
File "generate.py", line 11, in <module>
cli_main()
File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
parser = options.get_generation_parser()
File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
parser = get_parser("Generation", default_task)
File "/home/attn2d/fairseq/options.py", line 197, in get_parser
utils.import_user_module(usr_args)
File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
importlib.import_module(module_name)
File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
from . import models, tasks
File "/home/attn2d/examples/waitk/models/__init__.py", line 7, in <module>
importlib.import_module('examples.simultaneous.models.' + model_name)
File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'examples.simultaneous'
EDIT
Okay, here is more detail about this:
I believe this line (line 7 of `examples/waitk/models/__init__.py`, per the traceback) is responsible for the error message shared above.
I changed `importlib.import_module('examples.simultaneous.models.' + model_name)` to `importlib.import_module('examples.waitk.models.' + model_name)`.
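(For context, the auto-import loop in `examples/waitk/models/__init__.py` follows the usual fairseq pattern, so after my edit it presumably looks roughly like the sketch below; the exact file contents may differ.)

```python
# Sketch of the usual fairseq auto-import pattern in examples/waitk/models/__init__.py,
# with the package name corrected from examples.simultaneous to examples.waitk.
import importlib
import os

for file in os.listdir(os.path.dirname(__file__)):
    if file.endswith('.py') and not file.startswith('_'):
        model_name = file[:file.find('.py')]
        importlib.import_module('examples.waitk.models.' + model_name)
```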
Then, I got another error:
File "generate.py", line 11, in <module>
cli_main()
File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
parser = options.get_generation_parser()
File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
parser = get_parser("Generation", default_task)
File "/home/attn2d/fairseq/options.py", line 197, in get_parser
utils.import_user_module(usr_args)
File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
importlib.import_module(module_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
from . import models, tasks
File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
importlib.import_module('examples.waitk.models.' + model_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
from . import models, tasks
File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
importlib.import_module('examples.waitk.models.' + model_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 24, in <module>
from examples.simultaneous.modules import TransformerEncoderLayer, TransformerDecoderLayer
So, I changed that line to `from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer` as well.
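In other words, the patched import in `examples/waitk/models/waitk_transformer.py` (around line 24, going by the traceback) is simply:

```python
# before: the package examples.simultaneous does not exist in this repo
# from examples.simultaneous.modules import TransformerEncoderLayer, TransformerDecoderLayer
# after: point the import at examples.waitk instead
from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer
```

When I tried once more, I got the following error: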
File "generate.py", line 11, in <module>
cli_main()
File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
parser = options.get_generation_parser()
File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
parser = get_parser("Generation", default_task)
File "/home/attn2d/fairseq/options.py", line 197, in get_parser
utils.import_user_module(usr_args)
File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
importlib.import_module(module_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
from . import models, tasks
File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
importlib.import_module('examples.waitk.models.' + model_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
from . import models, tasks
File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
importlib.import_module('examples.waitk.models.' + model_name)
File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 25, in <module>
from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer
File "/home/attn2d/examples/waitk/modules/__init__.py", line 2, in <module>
from .controller import Controller
So, to fix it, I commented out the following lines in `examples/waitk/modules/__init__.py`:
from .controller import Controller
from .branch_controller import BranchController
from .oracle import SimulTransOracleDP, SimulTransOracleDP1
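After that edit, `examples/waitk/modules/__init__.py` looks roughly like this on my side (a sketch: I only commented lines out, and I'm assuming the remaining imports in that file, such as the transformer layers used above, are left untouched):

```python
# examples/waitk/modules/__init__.py (sketch of my local edit; other imports left as-is)

# commented out because the traceback above points at these imports:
# from .controller import Controller
# from .branch_controller import BranchController
# from .oracle import SimulTransOracleDP, SimulTransOracleDP1
```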
Next, I tried the generation command given in the readme once more:
CUDA_VISIBLE_DEVICES=0 python generate.py pretrained-sources/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pretrained-sources/model.pt --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output
I got this error:
2021-09-20 20:29:46 | INFO | fairseq_cli.generate | Namespace(all_gather_list_size=16384, beam=1, bpe=None, checkpoint_suffix='', cpu=False, criterion='cross_entropy', data='pretrained-sources/iwslt14_deen_bpe10k_binaries/', data_buffer_size=0, dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eval_bleu=False, eval_bleu_args=None, eval_bleu_detok='space', eval_bleu_detok_args=None, eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, eval_waitk=5, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, left_pad_source='False', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=None, max_source_positions=1024, max_target_positions=1024, max_tokens=8000, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides="{'max_source_positions': 1024, 'max_target_positions': 1024}", model_parallel_size=1, momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=True, no_repeat_ngram_size=0, num_shards=1, num_workers=1, optimizer='nag', path='pretrained-sources/model.pt', prefix_size=0, print_alignment=False, print_step=False, quantization_config_path=None, quiet=False, remove_bpe='@@ ', replace_unk=None, required_batch_size_multiple=8, results_path=None, retain_iter_history=False, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', target_lang='en', task='waitk_translation', temperature=1.0, tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, truncate_source=False, unkpen=0, unnormalized=False, upsample_primary=1, user_dir='examples/waitk', warmup_updates=0, weight_decay=0.0)
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [de] dictionary: 8848 types
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [en] dictionary: 6632 types
2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.de
2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.en
2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | pretrained-sources/iwslt14_deen_bpe10k_binaries/ test de-en 6750 examples
2021-09-20 20:29:46 | INFO | fairseq_cli.generate | loading model(s) from pretrained-sources/model.pt
Traceback (most recent call last):
File "generate.py", line 11, in <module>
cli_main()
File "/home/attn2d/fairseq_cli/generate.py", line 278, in cli_main
main(args)
File "/home/attn2d/fairseq_cli/generate.py", line 36, in main
return _main(args, sys.stdout)
File "/home/attn2d/fairseq_cli/generate.py", line 103, in _main
num_workers=args.num_workers,
File "/home/attn2d/fairseq/tasks/fairseq_task.py", line 181, in get_batch_iterator
required_batch_size_multiple=required_batch_size_multiple,
File "/home/attn2d/fairseq/data/data_utils.py", line 220, in batch_by_size
from fairseq.data.data_utils_fast import batch_by_size_fast
File "fairseq/data/data_utils_fast.pyx", line 1, in init fairseq.data.data_utils_fast
# cython: language_level=3
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
##########
I just gave up after that. @elbayadm, I hope you can help me with this.
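For what it's worth, the final `ValueError: numpy.ndarray size changed, may indicate binary incompatibility` usually means the compiled Cython extension (`fairseq/data/data_utils_fast.pyx`) was built against a different numpy version than the one installed at runtime. If that is the case here, rebuilding the extensions against the current numpy might help (untested on my side, and which numpy version to settle on is an assumption):

```bash
cd ~/attn2d
pip install --upgrade numpy              # or pin numpy to the version the extension was built with
python setup.py build_ext --inplace      # recompile the Cython extensions against the installed numpy
```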
Environment
I followed the instructions in the README to install my environment:
git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .
As a result, I have the following libraries in my environment:
Python 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.9.0+cu102'
>>> import fairseq
>>> fairseq.__version__
'0.9.0'
>>>
$ python --version
Python 3.7.10
Operating system: Linux