Hi, I have set up the conda environment and ran the scripts, i.e. first running
python -m scripts.download_model \
    --model allenai/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688 \
    --serialization_dir $(pwd)/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688
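(For reference, a quick way to sanity-check that step is to load the downloaded directory directly with transformers; a minimal sketch, assuming the directory has the standard Hugging Face layout of config.json + pytorch_model.bin:)

# Hypothetical sanity check for the downloaded weights (not part of the repo's
# scripts); assumes the standard Hugging Face directory layout.
from transformers import AutoModel, AutoTokenizer

path = "pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)
print(type(model).__name__)  # expect a RoBERTa model class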
Then I ran
python -m scripts.train \
    --config training_config/classifier.jsonnet \
    --serialization_dir model_logs/citation-intent-dapt-dapt \
    --hyperparameters ROBERTA_CLASSIFIER_SMALL \
    --dataset citation_intent \
    --model $(pwd)/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688 \
    --device 0 \
    --perf +f1 \
    --evaluate_on_test
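As far as I can tell from the traceback below, scripts/train.py just assembles an allennlp train command and shells out to it, roughly like this (a sketch reconstructed from the subprocess call visible in the trace, not the actual source):

# Rough reconstruction of the final step of scripts/train.py, based on the
# subprocess call and command string visible in the traceback below.
import subprocess

allennlp_command = [
    "allennlp", "train",
    "--include-package", "dont_stop_pretraining",
    "training_config/classifier.jsonnet",
    "-s", "model_logs/citation-intent-dapt-dapt",
]
subprocess.run(" ".join(allennlp_command), shell=True, check=True)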
But now I am getting this error:
/home/mikeleatila/anaconda3/envs/domains/bin/python /home/mikeleatila/dont_stop_pretraining_master/scripts/train.py --config training_config/classifier.jsonnet --serialization_dir model_logs/citation-intent-dapt-dapt --hyperparameters ROBERTA_CLASSIFIER_SMALL --dataset citation_intent --model /home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688 --device 0 --perf +f1 --evaluate_on_test
2022-11-24 09:59:10,204 - INFO - transformers.file_utils - PyTorch version 1.13.0 available.
2022-11-24 09:59:10,816 - INFO - pytorch_pretrained_bert.modeling - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
Traceback (most recent call last):
File "/home/mikeleatila/anaconda3/envs/domains/bin/allennlp", line 8, in
sys.exit(run())
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/run.py", line 18, in run
main(prog="allennlp")
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/commands/init.py", line 93, in main
args.func(args)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/commands/train.py", line 144, in train_model_from_args
dry_run=args.dry_run,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/commands/train.py", line 203, in train_model_from_file
dry_run=dry_run,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/commands/train.py", line 266, in train_model
dry_run=dry_run,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/commands/train.py", line 450, in _train_worker
batch_weight_key=batch_weight_key,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 555, in from_params
**extras,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 583, in from_params
kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 188, in create_kwargs
cls.__name__, param_name, annotation, param.default, params, **extras
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 294, in pop_and_construct_arg
return construct_arg(class_name, name, popped_params, annotation, default, **extras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 329, in construct_arg
return annotation.from_params(params=popped_params, **subextras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 555, in from_params
**extras,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 583, in from_params
kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 188, in create_kwargs
cls.__name__, param_name, annotation, param.default, params, **extras
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 294, in pop_and_construct_arg
return construct_arg(class_name, name, popped_params, annotation, default, **extras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 372, in construct_arg
**extras,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 329, in construct_arg
return annotation.from_params(params=popped_params, **subextras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 555, in from_params
**extras,
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 583, in from_params
kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/from_params.py", line 199, in create_kwargs
params.assert_empty(cls.__name__)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/site-packages/allennlp/common/params.py", line 421, in assert_empty
"Extra parameters passed to {}: {}".format(class_name, self.params)
allennlp.common.checks.ConfigurationError: Extra parameters passed to PretrainedTransformerIndexer: {'do_lowercase': False}
2022-11-24 09:59:10,850 - INFO - allennlp.common.params - random_seed = 58860
2022-11-24 09:59:10,850 - INFO - allennlp.common.params - numpy_seed = 58860
2022-11-24 09:59:10,850 - INFO - allennlp.common.params - pytorch_seed = 58860
2022-11-24 09:59:10,851 - INFO - allennlp.common.checks - Pytorch version: 1.13.0
2022-11-24 09:59:10,851 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.commands.train.TrainModel'> from params {'validation_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/dev.jsonl', 'evaluate_on_test': True, 'model': {'dropout': '0.1', 'feedforward_layer': {'activations': 'tanh', 'hidden_dims': 768, 'input_dim': 768, 'num_layers': 1}, 'seq2vec_encoder': {'embedding_dim': 768, 'type': 'cls_pooler_x'}, 'text_field_embedder': {'roberta': {'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'type': 'basic_classifier_with_f1'}, 'iterator': {'batch_size': 16, 'sorting_keys': [['tokens', 'num_tokens']], 'type': 'bucket'}, 'validation_iterator': {'batch_size': 64, 'sorting_keys': [['tokens', 'num_tokens']], 'type': 'bucket'}, 'train_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/train.jsonl', 'test_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/test.jsonl', 'trainer': {'cuda_device': 0, 'gradient_accumulation_batch_size': 16, 'num_epochs': 10, 'num_serialized_models_to_keep': 0, 'optimizer': {'b1': 0.9, 'b2': 0.98, 'e': 1e-06, 'lr': '2e-05', 'max_grad_norm': 1, 'parameter_groups': [[['bias', 'LayerNorm.bias', 'LayerNorm.weight', 'layer_norm.weight'], {'weight_decay': 0}, []]], 'schedule': 'warmup_linear', 't_total': -1, 'type': 'bert_adam', 'warmup': 0.06, 'weight_decay': 0.1}, 'patience': 3, 'validation_metric': '+f1'}, 'validation_dataset_reader': {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}, 'type': 'text_classification_json_with_sampling'}, 'dataset_reader': {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}, 'type': 'text_classification_json_with_sampling'}} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,851 - INFO - allennlp.common.params - type = default
2022-11-24 09:59:10,851 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.commands.train.TrainModel'> from params {'validation_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/dev.jsonl', 'evaluate_on_test': True, 'model': {'dropout': '0.1', 'feedforward_layer': {'activations': 'tanh', 'hidden_dims': 768, 'input_dim': 768, 'num_layers': 1}, 'seq2vec_encoder': {'embedding_dim': 768, 'type': 'cls_pooler_x'}, 'text_field_embedder': {'roberta': {'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'type': 'basic_classifier_with_f1'}, 'iterator': {'batch_size': 16, 'sorting_keys': [['tokens', 'num_tokens']], 'type': 'bucket'}, 'validation_iterator': {'batch_size': 64, 'sorting_keys': [['tokens', 'num_tokens']], 'type': 'bucket'}, 'train_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/train.jsonl', 'test_data_path': 'https://s3-us-west-2.amazonaws.com/allennlp/dont_stop_pretraining/data/citation_intent/test.jsonl', 'trainer': {'cuda_device': 0, 'gradient_accumulation_batch_size': 16, 'num_epochs': 10, 'num_serialized_models_to_keep': 0, 'optimizer': {'b1': 0.9, 'b2': 0.98, 'e': 1e-06, 'lr': '2e-05', 'max_grad_norm': 1, 'parameter_groups': [[['bias', 'LayerNorm.bias', 'LayerNorm.weight', 'layer_norm.weight'], {'weight_decay': 0}, []]], 'schedule': 'warmup_linear', 't_total': -1, 'type': 'bert_adam', 'warmup': 0.06, 'weight_decay': 0.1}, 'patience': 3, 'validation_metric': '+f1'}, 'validation_dataset_reader': {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}, 'type': 'text_classification_json_with_sampling'}, 'dataset_reader': {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}, 'type': 'text_classification_json_with_sampling'}} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,851 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}, 'type': 'text_classification_json_with_sampling'} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,851 - INFO - allennlp.common.params - dataset_reader.type = text_classification_json_with_sampling
2022-11-24 09:59:10,851 - INFO - allennlp.common.from_params - instantiating class <class 'dont_stop_pretraining.data.dataset_readers.text_classification_json_reader_with_sampling.TextClassificationJsonReaderWithSampling'> from params {'lazy': False, 'max_sequence_length': 512, 'token_indexers': {'roberta': {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'}}, 'tokenizer': {'do_lowercase': False, 'end_tokens': ['</s>'], 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'start_tokens': ['<s>'], 'type': 'pretrained_transformer'}} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,851 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.token_indexers.token_indexer.TokenIndexer'> from params {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688', 'type': 'pretrained_transformer'} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,852 - INFO - allennlp.common.params - dataset_reader.token_indexers.roberta.type = pretrained_transformer
2022-11-24 09:59:10,852 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.token_indexers.pretrained_transformer_indexer.PretrainedTransformerIndexer'> from params {'do_lowercase': False, 'model_name': '/home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688'} and extras {'batch_weight_key', 'local_rank', 'serialization_dir'}
2022-11-24 09:59:10,852 - INFO - allennlp.common.params - dataset_reader.token_indexers.roberta.token_min_padding_length = 0
2022-11-24 09:59:10,852 - INFO - allennlp.common.params - dataset_reader.token_indexers.roberta.model_name = /home/mikeleatila/dont_stop_pretraining_master/pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688
2022-11-24 09:59:10,852 - INFO - allennlp.common.params - dataset_reader.token_indexers.roberta.namespace = tags
2022-11-24 09:59:10,852 - INFO - allennlp.common.params - dataset_reader.token_indexers.roberta.max_length = None
Traceback (most recent call last):
File "/home/mikeleatila/dont_stop_pretraining_master/scripts/train.py", line 142, in
main()
File "/home/mikeleatila/dont_stop_pretraining_master/scripts/train.py", line 139, in main
subprocess.run(" ".join(allennlp_command), shell=True, check=True)
File "/home/mikeleatila/anaconda3/envs/domains/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'allennlp train --include-package dont_stop_pretraining training_config/classifier.jsonnet -s model_logs/citation-intent-dapt-dapt' returned non-zero exit status 1.
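If it helps narrow things down, the failure should be reproducible with just the indexer sub-config from training_config/classifier.jsonnet (a minimal sketch, assuming the allennlp version installed in this environment):

# Minimal repro sketch: push the indexer sub-config through from_params.
# assert_empty should raise the same ConfigurationError, since 'do_lowercase'
# is not a constructor argument of PretrainedTransformerIndexer in this
# allennlp version; per the traceback, it fires before the constructor runs.
from allennlp.common import Params
from allennlp.data.token_indexers import TokenIndexer

params = Params({
    "type": "pretrained_transformer",
    "model_name": "pretrained_models/dsp_roberta_base_dapt_cs_tapt_citation_intent_1688",
    "do_lowercase": False,
})
indexer = TokenIndexer.from_params(params)
# -> allennlp.common.checks.ConfigurationError:
#    Extra parameters passed to PretrainedTransformerIndexer: {'do_lowercase': False}

So it looks like the 'do_lowercase' keys in training_config/classifier.jsonnet are not accepted by the PretrainedTransformerIndexer in the allennlp version installed here.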
Many thanks in advance!