This toolkit provides code to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks.

ASAPP Research

Last update: Sep 21, 2022

Overview

slue-toolkit

We introduce the Spoken Language Understanding Evaluation (SLUE) benchmark. This toolkit provides code to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks. See https://arxiv.org/abs/2111.10367 for more details.

News

Nov. 22: We release the SLUE paper on arXiv along with the slue-toolkit repository. The repository contains data processing and evaluation scripts. We will publish the scripts for training the baseline models soon.

Installation

Clone this repository and install slue-toolkit in development mode:

git clone https://github.com/asappresearch/slue-toolkit.git
cd slue-toolkit
pip install -e .

Or install directly from GitHub:

pip install git+https://github.com/asappresearch/slue-toolkit.git

Install additional dependencies as needed (e.g., the baselines require fairseq and transformers).
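Before running a baseline, it can help to confirm that the optional packages are importable. The snippet below is a minimal sketch using only the Python standard library; the helper name `missing_packages` and the package list are illustrative assumptions, not part of slue-toolkit.

```python
import importlib.util

# Optional packages named in the installation notes; extend as needed.
OPTIONAL_PACKAGES = ["fairseq", "transformers"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(OPTIONAL_PACKAGES)
if missing:
    print("Missing optional dependencies:", ", ".join(missing))
else:
    print("All optional dependencies are importable.")
```

The check only looks up top-level package names and does not verify versions.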

SLUE Tasks

Automatic Speech Recognition (ASR)

Although this is not an SLU task, ASR can help analyze the performance of downstream SLU tasks in the same domain. Additionally, pipeline approaches depend on ASR outputs, making ASR relevant to SLU. ASR is evaluated using word error rate (WER).
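As an illustration of the metric, WER for a single utterance is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The function below is a minimal sketch under that standard definition; it is not the toolkit's own scorer, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat", "the dog sat")  # one substitution in three words
```

Corpus-level WER is usually computed by summing the edit distances and the reference lengths over all utterances before dividing, rather than by averaging per-utterance values.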

Named Entity Recognition (NER)

Named entity recognition involves detecting the named entities and their tags (types) in a given sentence. We evaluate performance using micro-averaged F1 and label-F1 scores. The F1 score evaluates an unordered list of named entity phrase and tag pairs predicted for each sentence. Only the tag predictions are considered for label-F1.
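The two scores can be sketched as follows. This is an illustrative reading of the description above, not the toolkit's evaluation code: each sentence is treated as an unordered multiset of (phrase, tag) pairs, counts are pooled over all sentences (micro-averaging), and label-F1 reuses the same routine after dropping the phrases. The tag names and example sentences are made up.

```python
from collections import Counter


def micro_f1(gold, pred):
    """gold, pred: one list of hashable items per sentence; order is ignored."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        g_count, p_count = Counter(g), Counter(p)
        tp += sum((g_count & p_count).values())  # multiset intersection
        n_gold += sum(g_count.values())
        n_pred += sum(p_count.values())
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def tags_only(sentences):
    """Keep only the tag of each (phrase, tag) pair."""
    return [[tag for _, tag in sent] for sent in sentences]


# Illustrative data: the second phrase is misspelled in the prediction,
# and the entity in the second sentence is missed entirely.
gold = [[("european union", "ORG"), ("brussels", "GPE")], [("monday", "DATE")]]
pred = [[("european union", "ORG"), ("brussel", "GPE")], []]

f1 = micro_f1(gold, pred)
label_f1 = micro_f1(tags_only(gold), tags_only(pred))
```

In this example the misspelled phrase counts as an error for F1 but still earns credit under label-F1, because only the tag is compared.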

Sentiment Analysis (SA)

Sentiment analysis refers to classifying a given speech segment as having negative, neutral, or positive sentiment. We evaluate SA using macro-averaged (unweighted) recall and F1 scores.
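A minimal sketch of macro-averaged recall and F1 for the three sentiment classes is shown below. Per-class scores are computed first and then averaged with equal weight, regardless of class frequency. The function name and the label strings are assumptions for illustration; the toolkit's own label encoding may differ.

```python
def macro_recall_f1(gold, pred, labels=("negative", "neutral", "positive")):
    """Return (macro recall, macro F1) as unweighted means over `labels`."""
    recalls, f1s = [], []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return sum(recalls) / len(labels), sum(f1s) / len(labels)


gold = ["negative", "neutral", "neutral", "positive"]
pred = ["negative", "neutral", "positive", "positive"]
macro_recall, macro_f1 = macro_recall_f1(gold, pred)
```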

Datasets

Corpus	Fine-tune utts (hours)	Dev utts (hours)	Test utts (hours)	Tasks	License
SLUE-VoxPopuli	5,000 (14.5)	1,753 (5.0)	1,842 (4.9)	ASR, NER	CC0 (check complete license here)
SLUE-VoxCeleb	5,777 (12.8)	955 (2.1)	4,052 (9.0)	ASR, SA	CC-BY 4.0 (check complete license here)

SLUE uses the VoxCeleb and VoxPopuli datasets. We carefully curated subsets of those datasets for fine-tuning and evaluation on the SLUE tasks, and we redistribute the subsets, so you don't need to download the full, much larger datasets. The subsets also include the human annotations and transcriptions for the SLUE tasks. All you need to do is run the script below, which downloads and pre-processes the datasets.

Download and pre-process dataset

bash scripts/download_datasets.sh

SLUE score evaluation

The test set data and annotations will be used for the official SLUE score evaluation; however, we will not release the test set annotations. The SLUE score is therefore evaluated by submitting your predictions in TSV format. We will prepare a website to accept submissions. Please stay tuned.

Model development rule

To train a model, you can use the fine-tuning and dev sets (audio, transcription, and annotation) of the SLUE tasks, but not the test set. Additionally, you can use any external dataset, labeled or unlabeled, for any training purpose (e.g., pre-training and fine-tuning).

For validation of your model, you can use the official dev set we provide, or you can make your own splits or cross-validation splits by mixing the fine-tuning and dev sets together.
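One way to build such cross-validation splits is sketched below. The helper `k_fold_splits` and the placeholder utterance ids are assumptions for illustration; in practice the ids would come from the fine-tune and dev manifests, and the test set would stay out of the pool.

```python
import random


def k_fold_splits(utterance_ids, k=5, seed=0):
    """Yield (train_ids, valid_ids) pairs; every id is held out exactly once."""
    ids = sorted(utterance_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the splits reproducible
    folds = [ids[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, valid


# Placeholder ids standing in for the pooled fine-tune and dev utterances.
fine_tune_ids = [f"ft_{i}" for i in range(8)]
dev_ids = [f"dev_{i}" for i in range(2)]
splits = list(k_fold_splits(fine_tune_ids + dev_ids, k=5))
```

Each returned pair can then be written out as its own pair of manifest files for training and validation.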

Baselines

ASR

Fine-tuning

Assuming that the preprocessed manifest files are in manifest/slue-voxceleb and manifest/slue-voxpopuli for SLUE-VoxCeleb and SLUE-VoxPopuli respectively, these commands fine-tune a wav2vec 2.0 base model on each of the two datasets using one GPU.

bash baselines/asr/ft-w2v2-base.sh manifest/slue-voxceleb save/asr/w2v2-base-vc
bash baselines/asr/ft-w2v2-base.sh manifest/slue-voxpopuli save/asr/w2v2-base-vp

Evaluation

To evaluate the fine-tuned wav2vec 2.0 ASR models on the dev set, please run the following commands.

python slue_toolkit/eval/eval_w2v.py eval_asr save/asr/w2v2-base-vc --data manifest/slue-voxceleb --subset dev
python slue_toolkit/eval/eval_w2v.py eval_asr save/asr/w2v2-base-vp --data manifest/slue-voxpopuli --subset dev

The WER will be printed directly. The predictions are saved in save/asr/w2v2-base-vc/pred-dev.wrd and save/asr/w2v2-base-vp/pred-dev.wrd and can be used for pipeline models.

More details on the baseline experiments are described here

NER

Fine-tuning End-to-end model

Assuming that the preprocessed manifest files are in manifest/slue-voxpopuli for SLUE-VoxPopuli, this command fine-tunes a wav2vec 2.0 base model using one GPU.

bash baselines/ner/e2e_scripts/ft-w2v2-base.sh manifest/slue-voxpopuli/e2e_ner save/e2e_ner/w2v2-base

Evaluating End-to-End model

To evaluate the fine-tuned wav2vec 2.0 E2E NER model on the dev set, run the following command (decoding without a language model).

bash baselines/ner/e2e_scripts/eval-ner.sh w2v2-base dev combined nolm

More details on the baseline experiments are described here

Sentiment Analysis

Fine-tuning

This command fine-tunes a wav2vec 2.0 base model on the SLUE-VoxCeleb dataset:

bash baselines/sentiment/ft-w2v2-base-senti.sh manifest/slue-voxceleb save/sentiment/w2v2-base

Evaluation

To evaluate the fine-tuned wav2vec 2.0 sentiment model, run the following command or run baselines/sentiment/e2e_scripts/eval.sh:

python3 slue_toolkit/eval/eval_w2v_sentiment.py --save-dir save/sentiment/w2v2-base --data manifest/slue-voxceleb --subset dev

More details on the baseline experiments are described here

Comments

Fix text NER evaluation

Following the PR in https://github.com/asappresearch/slue-toolkit/pull/5, I started testing the evaluation pipeline for the text NER, and it also seems to be quite flaky. I have fixed most of the errors, but there are some critical ones that I mention in https://github.com/asappresearch/slue-toolkit/issues/6.

It would be great if you could make edits to this PR to handle the remaining todos mentioned in the issue.

I have currently kept my previous PRs committed to this one as well, but I am happy to remove them once you merge those.

Thanks Sid

opened by siddalmia 11

Sentiment Analysis baseline

Hi,

I wanted to reproduce the sentiment analysis baseline through

bash baselines/sentiment/e2e_scripts/ft-w2v2-base-senti.sh manifest/slue-voxceleb save/sentiment/w2v2-base

Fairseq Config log:

[2022-02-14 01:39:15,687][fairseq_cli.train][INFO] - {'_name': None, 'common': {'_name': None, 'no_progress_bar': False, 'log_interval': 100, 'log_format': 'json', 'log_fil[37/1798]
 'tensorboard_logdir': 'save/sentiment/w2v2-base/tb_logs', 'wandb_project': None, 'azureml_logging': False, 'seed': 1, 'cpu': False, 'tpu': False, 'bf16': False, 'memory_efficient_b
f16': False, 'fp16': True, 'memory_efficient_fp16': False, 'fp16_no_flatten_grads': False, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'on_cpu_c$
nvert_precision': False, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'amp': False, 'amp_batch_retries': 2, 'amp_init_scale': 128, 'amp_scale_window': None, 'user_dir': $
/root/pushkal/slue-toolkit/slue_toolkit/fairseq_addon', 'empty_cache_freq': 0, 'all_gather_list_size': 16384, 'model_parallel_size': 1, 'quantization_config_path': None, 'profile':
False, 'reset_logging': False, 'suppress_crashes': False, 'use_plasma_view': False, 'plasma_path': '/tmp/plasma'}, 'common_eval': {'_name': None, 'path': None, 'post_process': None$
 'quiet': False, 'model_overrides': '{}', 'results_path': None}, 'distributed_training': {'_name': None, 'distributed_world_size': 1, 'distributed_num_procs': 1, 'distributed_rank'$
 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'distributed_no_spawn': False, 'ddp_backend': 'c10d', 'ddp_comm_hook': '$
one', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': False, 'find_unused_parameters': True, 'gradient_as_bucket_view': False, 'fast_stat_sync': False, 'heartbeat_timeout': -1, 'broadc$
st_buffers': False, 'slowmo_momentum': None, 'slowmo_base_algorithm': 'localsgd', 'localsgd_frequency': 3, 'nprocs_per_node': 1, 'pipeline_model_parallel': False, 'pipeline_balance$
: None, 'pipeline_devices': None, 'pipeline_chunks': 0, 'pipeline_encoder_balance': None, 'pipeline_encoder_devices': None, 'pipeline_decoder_balance': None, 'pipeline_decoder_devi$
es': None, 'pipeline_checkpoint': 'never', 'zero_sharding': 'none', 'fp16': True, 'memory_efficient_fp16': False, 'tpu': False, 'no_reshard_after_forward': False, 'fp32_reduce_scat$
er': False, 'cpu_offload': False, 'use_sharded_state': False, 'not_fsdp_flatten_parameters': False}, 'dataset': {'_name': None, 'num_workers': 0, 'skip_invalid_size_inputs_valid_te$
t': False, 'max_tokens': 1400000, 'batch_size': None, 'required_batch_size_multiple': 8, 'required_seq_len_multiple': 1, 'dataset_impl': None, 'data_buffer_size': 10, 'train_subset$
: 'fine-tune', 'valid_subset': 'dev', 'combine_valid_subsets': None, 'ignore_unused_valid_subsets': False, 'validate_interval': 1000000, 'validate_interval_updates': 0, 'validate_a$
ter_updates': 2000, 'fixed_validation_seed': None, 'disable_validation': False, 'max_tokens_valid': 1400000, 'batch_size_valid': None, 'max_valid_steps': None, 'curriculum': 0, 'ge$
_subset': 'test', 'num_shards': 1, 'shard_id': 0, 'grouped_shuffling': False, 'update_epoch_batch_itr': False, 'update_ordered_indices_seed': False}, 'optimization': {'_name': None$
 'max_epoch': 0, 'max_update': 50000, 'stop_time_hours': 0.0, 'clip_norm': 0.0, 'sentence_avg': True, 'update_freq': [1], 'lr': [2e-05], 'stop_min_lr': -1.0, 'use_bmuf': False, 'sk$
p_remainder_batch': False}, 'checkpoint': {'_name': None, 'save_dir': 'checkpoints', 'restore_file': 'checkpoint_last.pt', 'continue_once': None, 'finetune_from_model': None, 'rese$
_dataloader': False, 'reset_lr_scheduler': False, 'reset_meters': False, 'reset_optimizer': False, 'optimizer_overrides': '{}', 'save_interval': 50, 'save_interval_updates': 1000, $
keep_interval_updates': 1, 'keep_interval_updates_pattern': -1, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': False, 'no_epoch_checkpoints': True, 'no_last_checkp$
ints': False, 'no_save_optimizer_state': False, 'best_checkpoint_metric': 'macro_f1', 'maximize_best_checkpoint_metric': True, 'patience': -1, 'checkpoint_suffix': '', 'checkpoint_$
hard_count': 1, 'load_checkpoint_on_all_dp_ranks': False, 'write_checkpoints_asynchronously': False, 'model_parallel_size': 1}, 'bmuf': {'_name': None, 'block_lr': 1.0, 'block_mome$
tum': 0.875, 'global_sync_iter': 50, 'warmup_iterations': 500, 'use_nbm': False, 'average_sync': False, 'distributed_world_size': 1}, 'generation': {'_name': None, 'beam': 5, 'nbes$
': 1, 'max_len_a': 0.0, 'max_len_b': 200, 'min_len': 1, 'match_source_len': False, 'unnormalized': False, 'no_early_stop': False, 'no_beamable_mm': False, 'lenpen': 1.0, 'unkpen': $
.0, 'replace_unk': None, 'sacrebleu': False, 'score_reference': False, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': False, 'sampling_topk': -1, 'sampling_topp': -1.0, '$
onstraints': None, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': None, 'print_step': False, 'lm_path': Non$
, 'lm_weight': 0.0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': False, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker$
: False, 'retain_iter_history': False, 'retain_dropout': False, 'retain_dropout_modules': None, 'decoding_format': None, 'no_seed_provided': False}, 'eval_lm': {'_name': None, 'out$
ut_word_probs': False, 'output_word_stats': False, 'context_window': 0, 'softmax_batch': 9223372036854775807}, 'interactive': {'_name': None, 'buffer_size': 0, 'input': '-'}, 'mode$
': {'_name': 'wav2vec2_seq_cls', 'w2v_path': '/root/pushkal/slue-toolkit/save/pretrained/wav2vec_small.pt', 'no_pretrained_weights': False, 'dropout_input': 0.0, 'final_dropout': 0$
0, 'dropout': 0.0, 'attention_dropout': 0.0, 'activation_dropout': 0.1, 'conv_feature_layers': '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] + [(512,2,2)]', 'encoder_embed_dim'$
 768, 'apply_mask': True, 'mask_length': 10, 'mask_prob': 0.65, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 6$
, 'mask_channel_prob': 0.5, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'freeze_finetune_updates': 2000, 'feature_grad_mult': 0.0$
 'layerdrop': 0.1, 'mask_channel_min_space': 1, 'mask_channel_before': False, 'normalize': '${task.normalize}', 'data': '${task.data}', 'w2v_args': None, 'pool_method': 'self_attn'$
 'classifier_dropout': 0.2}, 'task': {'_name': 'slue_audio_classification', 'data': '/root/pushkal/slue-toolkit/manifest/slue-voxceleb', 'labels': 'sent', 'binarized_dataset': Fals$
, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_sample_size': None, 'min_sample_size': None, 'num_batch_buckets': 0, 'precompute_mask_indices': False, 'in$
erred_w2v_config': None, 'tpu': '${common.tpu}', 'text_compression_level': none, 'label_dir': '???'}, 'criterion': {'_name': 'slue_sequence_classification'}, 'optimizer': {'_name':
'adam', 'adam_betas': '(0.9,0.98)', 'adam_eps': 1e-08, 'weight_decay': 0.0, 'use_old_adam': False, 'fp16_adam_stats': False, 'tpu': False, 'lr': [2e-05]}, 'lr_scheduler': {'_name':
'tri_stage', 'warmup_steps': 0, 'hold_steps': 0, 'decay_steps': 0, 'phase_ratio': [0.1, 0.0, 0.9], 'init_lr_scale': 0.01, 'final_lr_scale': 0.05, 'max_update': 50000.0, 'lr': [2e-05
]}, 'scoring': None, 'bpe': None, 'tokenizer': None, 'ema': {'_name': None, 'store_ema': False, 'ema_decay': 0.9999, 'ema_start_update': 0, 'ema_seed_model': None, 'ema_update_freq'
: 1, 'ema_fp32': False}, 'job_logging_cfg': {'version': 1, 'formatters': {'simple': {'format': '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'}}, 'handlers': {'console': {'cl
ass': 'logging.StreamHandler', 'formatter': 'simple', 'stream': 'ext://sys.stdout'}, 'file': {'class': 'logging.FileHandler', 'formatter': 'simple', 'filename': 'hydra_train.log'}},
 'root': {'level': 'INFO', 'handlers': ['console', 'file']}, 'disable_existing_loggers': False}}

But I am facing this error:

Traceback (most recent call last):
  File "/root/miniconda3/envs/slue/bin/fairseq-hydra-train", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-hydra-train')())
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq_cli/hydra_train.py", line 87, in cli_main
    hydra_main()
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/hydra/core/utils.py", line 129, in run_job
    ret.return_value = task_function(task_cfg)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq_cli/hydra_train.py", line 27, in hydra_main
    _hydra_main(cfg)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq_cli/hydra_train.py", line 56, in _hydra_main
    distributed_utils.call_main(cfg, pre_main, **kwargs)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq/distributed/utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq_cli/train.py", line 97, in main
    criterion = task.build_criterion(cfg.criterion)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq/tasks/fairseq_task.py", line 352, in build_criterion
    return criterions.build_criterion(cfg, self)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq/criterions/__init__.py", line 29, in build_criterion
    return build_criterion_(cfg, task)
  File "/root/pushkal/slue-toolkit/deps/fairseq/fairseq/registry.py", line 55, in build_x
    cls = REGISTRY[choice]
KeyError: 'slue_sequence_classification'

opened by pushkalkatara 7

Text NER baseline issues
Hi @ankitapasad @fwu-asapp @sshon-asapp,

I apologize for creating so many issues. I tried running the baselines again, but I think there are still some issues in the slue_toolkit/text_ner/ner_deberta_modules.py file.

For example -

https://github.com/asappresearch/slue-toolkit/blob/main/slue_toolkit/text_ner/ner_deberta_modules.py#L63 - This line tries to run a regex on a list, but I think regexes can only run on strings or bytes.

https://github.com/asappresearch/slue-toolkit/blob/main/slue_toolkit/text_ner/ner_deberta_modules.py#L111 - This line tries to read a file with the wrong name format. It should be f"{split_name}.{label_type}.tsv" instead.
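The first point can be reproduced in isolation. The snippet below is a generic sketch of the failure mode and one possible fix; the pattern and the input strings are made up and are not taken from ner_deberta_modules.py.

```python
import re

lines = ["Hello,  World!", "SLUE   toolkit"]
pattern = re.compile(r"\s+")

error_message = None
try:
    pattern.sub(" ", lines)  # passing the whole list raises TypeError
except TypeError as err:
    error_message = str(err)

# Applying the pattern to each string in turn works as intended.
cleaned = [pattern.sub(" ", line) for line in lines]
```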

Would it be possible to run a fresh checkout of slue-toolkit and see which bugs pop up?

The command that I am trying to run is - bash baselines/ner/nlp_scripts/ft-deberta.sh deberta-base combined

Please ignore if you have already caught these issues. Thanks Sid
opened by siddalmia 6
Clean up and fix E2E NER baselines
Create the dictionary and link the tsv file (instead of copying) for NER

Remove the pickle files and hardcode the mappings directly, since 1) it is more readable and 2) the pkl files can't be loaded correctly if people import slue_toolkit from outside the root of this repo

Fix the train_subset, valid_subset, and labels in the E2E baseline scripts
opened by fwu-asapp 6

Issues with baseline scripts

Hi,

I am running into an error when trying to run the baseline script for voxpopuli:

Traceback (most recent call last):                                                                                                                                                                    
  File "~/bin/anaconda3/envs/fairseq/bin/fairseq-hydra-train", line 33, in <module>                                                                                       
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-hydra-train')())                                                                                                                 
  File "~/repos/fairseq/fairseq_cli/hydra_train.py", line 87, in cli_main                                                                                                 
    hydra_main()                                                                                                                                                                                      
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main                                                               
    _run_hydra(                                                                                                                                                                                       
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra                                                       
    run_and_report(                                                                                                                                                                                   
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report                                                   
    raise ex                                                                                                                                                                                          
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report                                                   
    return func()                                                                                                                                                                                     
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>                                                         
    lambda: hydra.run(                                                                                                                                                                                
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run                                                              
    return run_job(                                                                                                                                                                                   
  File "~/bin/anaconda3/envs/fairseq/lib/python3.8/site-packages/hydra/core/utils.py", line 129, in run_job                                                               
    ret.return_value = task_function(task_cfg)                                                                                                                                                        
  File "~/repos/fairseq/fairseq_cli/hydra_train.py", line 27, in hydra_main                                                                                               
    _hydra_main(cfg)                                                                                                                                                                                  
  File "~/repos/fairseq/fairseq_cli/hydra_train.py", line 56, in _hydra_main                                                                                              
    distributed_utils.call_main(cfg, pre_main, **kwargs)                                                                                                                                              
  File "~/repos/fairseq/fairseq/distributed/utils.py", line 369, in call_main                                                                                             
    main(cfg, **kwargs)                                                                                                                                                                               
  File "~/repos/fairseq/fairseq_cli/train.py", line 164, in main                                                                                                          
    extra_state, epoch_itr = checkpoint_utils.load_checkpoint(                                                                                                                                        
  File "~/repos/fairseq/fairseq/checkpoint_utils.py", line 272, in load_checkpoint                                                                                        
    epoch_itr = trainer.get_train_iterator(                                                                                                                                                           
  File "~/repos/fairseq/fairseq/trainer.py", line 718, in get_train_iterator                                                                                              
    self.reset_dummy_batch(batch_iterator.first_batch)                                                                                                                                                
  File "~/repos/fairseq/fairseq/data/iterators.py", line 334, in first_batch                                                                                              
    return self.collate_fn([self.dataset[i] for i in self.frozen_batches[0]])                                                                                                                         
  File "~/repos/fairseq/fairseq/data/add_target_dataset.py", line 68, in collater                                                                                         
    collated["net_input"]["prev_output_tokens"],                                                                                                                                                      
KeyError: 'prev_output_tokens'

I have installed fairseq as recommended on the github page. Here is the installed version:

fairseq                   1.0.0a0+0f078de           dev_0    <develop>

Any pointers to solve this error?

Thank you

opened by qmeeus 2

Text NER Evaluation Pipeline doesn't seem to work
I am trying to run baselines/ner/nlp_scripts/eval-deberta.sh, but it seems to be broken in several places.

I have fixed some of the bugs in the PR (https://github.com/asappresearch/slue-toolkit/pull/7), but there seem to be more that I am unable to fix comfortably; they would require someone with expertise in this code to have a look -

eval_obj.get_scores in def eval( of slue_toolkit/text_ner/ner_deberta.py seems to pass asr_val_dataset, which is set to None when eval_asr is set to False.

This then causes an issue in the get_scores function in slue_toolkit/text_ner/ner_deberta_modules.py, because the run_inference invoked by get_scores uses asr_val_dataset in its DataLoader.

The self.get_entity_tags( call in the run_inference function is also broken when eval_asr is set to False, as it calls self.get_tag_map(indices=True), which seems to reference an undefined variable tag in tag2id_raw[pfx + tag].

Could you please review that PR and suggest changes for the above errors? I have kept the PR as [WIP]; feel free to edit it as you see fit.

Thanks Sid
opened by siddalmia 2
Fix text ner
Hi,

The text NER pipeline had a few bugs when running the command bash baselines/ner/nlp_scripts/ft-deberta.sh deberta-base raw

I have described them below and also provided fixes -

It was using the transformers and datasets libraries by Hugging Face, which I added to setup.py. I also added seqeval, which was being used by datasets.

The data was being downloaded into a folder named datasets, which conflicts with the datasets library that the text NER uses, so I have renamed the folder to dataset.

def train_module only receives the train and val datasets, but inside the function it was using eval_dataset, which is not defined. I have changed it to use the dev dataset.

def align_labels has a bug: the tag2id dictionary is created from the training data, but the validation data contains tags that do not appear there. I now map those tags to 'O'.
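The last fix can be illustrated with a small sketch. The helper names `build_tag2id` and `encode_tags` and the tag strings are hypothetical and do not mirror the actual align_labels code; the point is only the fallback to 'O' for tags missing from the training data.

```python
def build_tag2id(train_tags):
    """Map each tag seen in training to an integer id; 'O' always gets id 0."""
    tag2id = {"O": 0}
    for sent in train_tags:
        for tag in sent:
            tag2id.setdefault(tag, len(tag2id))
    return tag2id


def encode_tags(tags, tag2id):
    """Encode tags, falling back to 'O' for any tag unseen in training."""
    return [tag2id.get(tag, tag2id["O"]) for tag in tags]


train_tags = [["B-ORG", "I-ORG", "O"], ["O", "B-GPE"]]
tag2id = build_tag2id(train_tags)

# "B-LAW" does not occur in the training tags, so it is encoded as 'O'.
encoded = encode_tags(["B-ORG", "B-LAW", "O"], tag2id)
```

Mapping unseen tags to 'O' avoids a KeyError at validation time, at the cost of counting those entities as misses.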

I would highly recommend testing other parts of the toolkit! There seem to be many minor mistakes.

Thanks Sid
opened by siddalmia 2
Command to run E2E model of NER directly/only by speech model?

Hi, I have seen the readme of NER in https://github.com/he159ok/slue-toolkit

But I do not see a command to run an NER model directly, using only the speech model.

May I know the command to run it?

opened by he159ok 1
About submission

I sent my submission for the test set evaluation to "[email protected]", but there was no reply. I do not know whether I used the wrong email address or whether there is another reason.

opened by RuizhuoXu 1
Voxceleb evaluation
Hi, I have some doubts regarding which data can be used for pretraining the models. I plan to do a mixture of self-supervised and supervised pretraining and I wanted to know:

Can I use the Voxceleb1 audio for pretraining? I would only use the speaker id and nationality labels as supervision, plus some self-supervision (without labels), so my model would be agnostic to the sentiment annotations, but it might have some advantage in differentiating the Voxceleb1 speakers during fine-tuning, especially if the sentiments are imbalanced across speakers.

Did you follow the same original dev/test splits from Voxceleb1? I am pretraining only with the dev split, so if the sentiment analysis task is evaluated only on the test split, it would not be a problem. Am I right?
opened by mrpep 1
Black for consistent formatting
I ran black to bring the code to a consistent formatting, regarding the issue https://github.com/asappresearch/slue-toolkit/issues/3

black --version
black, version 19.10b0
opened by siddalmia 1
Plans to release ASR finetuned-models

Hi,

Thanks for the toolkit! I was wondering if there are plans to release the ASR finetuned models (or it is already there but I missed it). If not, are you accepting PRs on the ASR finetuned models by the community? Thanks in advance!

Jeff Hsu

opened by Splend1d 1
