Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"

Related tags

Deep Learning t-few
Overview

T-Few

This repository contains the official code for the paper: "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning".

This method outperforms in-context learning with GPT-3 and achieves state-of-the-art results on the RAFT benchmark.

Setup

First, create a virtual environment for the project and install all the requirements. (We use conda to manage environments. Be sure to install and initialize conda first.)

  1. Create a virtual environment with Python 3.7: conda create -n tfew python==3.7, then activate it with conda activate tfew.
  2. Install the remaining dependencies: pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
  3. If you plan to run SAID, install its additional dependencies with python src/intrinsic_said_setup.py develop. Otherwise, skip this step.

The steps above only need to be done once. In addition, every time you start a new session, you will need to run . bin/start.sh
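After that, a quick sanity check can confirm the environment sees the expected PyTorch build. This snippet is a minimal sketch, not part of the repo:

import torch

# The install step above pulls a cu113 PyTorch wheel, so a CUDA build is expected.
print(torch.__version__)
print(torch.cuda.is_available())  # True when a GPU is visible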

Run your first experiment

Once you have finished setting up the environment, you can try running CUDA_VISIBLE_DEVICES=3 python -m src.pl_train -c t0.json+rte.json -k save_model=False exp_name=first_exp. The outputs of this run will be saved to ${OUTPUT_PATH}/first_exp/, which is usually /t-few/exp_out/first_exp/. Here, first_exp is the experiment name; you can run more experiments with different experiment names. The code will automatically skip finished experiments. (However, if you wish to rerun a finished experiment under the same experiment name, you will need to manually remove the corresponding files in the output directory, as sketched below.)
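For instance, a hypothetical cleanup helper (an assumption, not part of the repo) that clears a finished experiment so it reruns under the same name:

import os
import shutil

# OUTPUT_PATH defaults here to exp_out, matching the usual output location above.
exp_dir = os.path.join(os.environ.get("OUTPUT_PATH", "exp_out"), "first_exp")
if os.path.isdir(exp_dir):
    shutil.rmtree(exp_dir)  # finished experiments are otherwise skipped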

There are two ways to control an experiment.

  1. You can specify config files with -c. Multiple config files can be combined with +. (When there are conflicts, config terms from the file on the right take precedence; see the sketch after this list.) This is convenient when you have multiple terms that form a fixed group.
  2. You can override values with -k. This is convenient when you need to change a small number of terms.
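To illustrate the precedence rule, here is a minimal sketch of how such configs could be merged. This is illustrative only, not the repo's actual config loader, and the treatment of -k overrides as applied last is an assumption:

import json

def load_config(config_arg, overrides=None):
    """Merge config files joined with '+', applying later files over earlier ones."""
    merged = {}
    for path in config_arg.split("+"):   # e.g. "t0.json+rte.json"
        with open(path) as f:
            merged.update(json.load(f))  # the right-most file takes precedence
    merged.update(overrides or {})       # assumed: -k key=value pairs win over files
    return merged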

It is recommended to use GPUs with 40GB of memory to train T0(3B) and 80GB to train T0(11B).

Run an array of experiments

In this project, we often need to run a large number of experiments. Here is an example bash script, bin/few-shot-pretrained-3b-100k.sh, which fine-tunes the 3B model with pre-trained (IA)3 on all datasets.

This should take a few hours. After that, you can use scripts/get_results_table.py to generate a CSV summary.
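As a rough illustration of what such a summary step does (a sketch, not the actual scripts/get_results_table.py; it assumes one JSON object per evaluation in each dev_scores.json, as seen in the issue reports below):

import csv
import glob
import json
import os

rows = []
for path in glob.glob("exp_out/*/dev_scores.json"):
    with open(path) as f:
        last = json.loads(f.readlines()[-1])  # keep the final evaluation entry
    rows.append({"experiment": os.path.basename(os.path.dirname(path)),
                 "accuracy": last["accuracy"]})

with open("results_summary.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["experiment", "accuracy"])
    writer.writeheader()
    writer.writerows(rows)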

Citation

If you find this repo helpful, please consider citing our work:

@article{liu2022tfew,
  title={Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning},
  author={Liu, Haokun and Tam, Derek and Muqeeth, Mohammed and Mohta, Jay and Huang, Tenghao and Bansal, Mohit and Raffel, Colin},
  journal={arXiv preprint arXiv:2205.05638},
  year={2022}
}

We use code from the following works:

@article{mahabadi2021compacter,
  title={Compacter: Efficient low-rank hypercomplex adapter layers},
  author={Mahabadi, Rabeeh Karimi and Henderson, James and Ruder, Sebastian},
  journal={arXiv preprint arXiv:2106.04647},
  year={2021}
}

@article{sung2021training,
  title={Training Neural Networks with Fixed Sparse Masks},
  author={Sung, Yi-Lin and Nair, Varun and Raffel, Colin},
  journal={arXiv preprint arXiv:2111.09839},
  year={2021}
}

@article{aghajanyan2020intrinsic,
  title={Intrinsic dimensionality explains the effectiveness of language model fine-tuning},
  author={Aghajanyan, Armen and Zettlemoyer, Luke and Gupta, Sonal},
  journal={arXiv preprint arXiv:2012.13255},
  year={2020}
}
Comments
  • Multi-GPU Support

    Hello,

    Have you tried training on a multi-GPU setup? I tried running your fine-tuning example like so:

    export CUDA_VISIBLE_DEVICES=0,1
    python -m src.pl_train -c t03b.json+ia3.json+rte.json -k load_weight="pretrained_checkpoints/t03b_ia3_finish.pt" exp_name=t03b_rte_seed42_ia3_pretrained100k few_shot_random_seed=42 seed=42
    

    But I get errors in the lightning data loaders.

    Any ideas? Thank you

    opened by danielkorat 8
  • Can't run 11 billion model on A100 with 80GB

    Hi @craffel @muqeeth @HaokunLiu,

    We're trying to reproduce T-Few results for a paper, but we're getting 'CUDA out of memory' using an A100 with 80GB (your recommended setup).

    This is what we're running:

    python -m src.pl_train -c t011b.json+ia3.json+rte.json -k load_weight="pretrained_checkpoints/t011b_ia3_finish.pt" exp_name=t011b_rte_seed42_ia3_pretrained few_shot_random_seed=42 seed=42
    

    We installed according to the README instructions and are using the default settings in the config files. We are able to run the 3 billion model using the command above, just not the 11 billion. Is there anything we are doing wrong?

    This is the exception:

    CUDA out of memory

    Thank you

    opened by danielkorat 5
  • KeyError: 'HF_HOME'

    Hi! I was trying to run the example in the README, but it fails with KeyError: 'HF_HOME'. This is the command I used: python -m src.pl_train -c t03b.json+rte.json -k save_model=False exp_name=first_exp. I can't find anywhere in the code that sets the value of this environment variable.

    Mark experiment first_exp as claimed
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    Using bfloat16 Automatic Mixed Precision (AMP)
    GPU available: False, used: False
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    Traceback (most recent call last):
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/Users/weiqiuyou/Documents/codebases/t-few/src/pl_train.py", line 86, in <module>
        main(config)
      File "/Users/weiqiuyou/Documents/codebases/t-few/src/pl_train.py", line 57, in main
        trainer.fit(model, datamodule)
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 741, in fit
        self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
        self._run(model, ckpt_path=ckpt_path)
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1131, in _run
        self._data_connector.prepare_data()
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 154, in prepare_data
        self.trainer.datamodule.prepare_data()
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 474, in wrapped_fn
        fn(*args, **kwargs)
      File "/Users/weiqiuyou/Documents/codebases/t-few/src/data/data_module.py", line 17, in prepare_data
        _ = self.dataset_reader.read_few_shot_dataset()
      File "/Users/weiqiuyou/Documents/codebases/t-few/src/data/dataset_readers.py", line 164, in read_few_shot_dataset
        orig_data = self.read_orig_dataset("train")
      File "/Users/weiqiuyou/Documents/codebases/t-few/src/data/dataset_readers.py", line 146, in read_orig_dataset
        orig_data = load_dataset(*self.dataset_stash, split=split, cache_dir=os.environ["HF_HOME"])
      File "/Users/weiqiuyou/opt/miniconda3/envs/tfew/lib/python3.7/os.py", line 678, in __getitem__
        raise KeyError(key) from None
    KeyError: 'HF_HOME'
    
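    The traceback shows the dataset reader calling load_dataset(..., cache_dir=os.environ["HF_HOME"]), so one workaround (an assumption, not an official fix) is to set that variable before launching, e.g.:

    import os

    # Point HF_HOME at any cache directory before datasets are loaded;
    # the path below is an illustrative choice, not one the repo mandates.
    os.environ.setdefault("HF_HOME", os.path.expanduser("~/.cache/huggingface"))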
    opened by fallcat 4
  • Validation score on WSC decreases with training

    Thank you for the amazing work on T-Few! I've noticed strange behavior when running SuperGLUE's WSC. I've been logging the validation score every 40 epochs using self.eval_epoch_interval = 40, and when running the command python -m src.pl_train -c ia3.json+wsc.json -k save_model=False exp_name=first_exp the output is as follows:

    {"accuracy": 0.6730769230769231, "score_gt": 0.5068197436630726, "score_cand": 0.7191649047801127} {"accuracy": 0.49038461538461536, "score_gt": 1.4563168384707892, "score_cand": 1.505529030584372} {"accuracy": 0.47115384615384615, "score_gt": 3.4743554890155792, "score_cand": 2.727144861450562} {"accuracy": 0.46153846153846156, "score_gt": 4.202766236777489, "score_cand": 3.5702959763316007} {"accuracy": 0.40384615384615385, "score_gt": 5.157541000499175, "score_cand": 3.5657502871293287} {"accuracy": 0.3942307692307692, "score_gt": 5.397989429533482, "score_cand": 3.975659689651086} {"accuracy": 0.40384615384615385, "score_gt": 5.073869264469697, "score_cand": 3.995581218542961}

    The last accuracy score is reported at 240 epochs out of a total 250 epochs.

    Any ideas on what is going on here? Thanks!

    opened by Raldir 3
  • To which epoch/training step does the finish.pt checkpoint belong?

    Hi everyone!

    When I run the experiments, the model is validated after every eval_epoch_interval and a checkpoint is written out as global_stepXXXXX.pt. At the end, a final checkpoint named finish.pt is also written out. I assumed this one belongs either to the best intermediate validation performance or to the last epoch. However, comparing it with the other checkpoints that were created, it seems that finish.pt differs from all global_stepXXXXX.pt checkpoints, so I am wondering: to which point in training does finish.pt belong?

    Sorry if I'm missing something obvious here.

    Best, Stefan

    opened by stefanhgm 3
  • Sum of logprobs in the probability space adds up to values above 1

    Hi! Congratulations on this great work and thank you for putting up such an easy-to-use framework! It definitely facilitates research quite a bit :)

    I was trying to interpret the scores logged during evaluation on the development set, and I realized that for two-class datasets (like RTE), summing the exponentiated negatives of the GT and CAND scores sometimes gives a value bigger than 1. Maybe I'm interpreting these scores wrongly, since I expected the sum of the scores, after converting them to probability space (that is, np.exp(-1 * logprob)), to be less than or equal to 1 for two-class datasets.
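    For concreteness, here is the arithmetic in question; the reading of the scores as independent per-answer negative log-likelihoods (rather than a softmax over the two choices) is an assumption:

    import numpy as np

    # Illustrative scores in the same form as dev_scores.json.
    score_gt, score_cand = 0.3984, 0.6959
    p_gt = np.exp(-1 * score_gt)      # ~0.671
    p_cand = np.exp(-1 * score_cand)  # ~0.499

    # If each score is an independent NLL of its answer string, nothing
    # normalizes the pair, so the "probabilities" can sum above 1.
    print(p_gt + p_cand)              # ~1.17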

    Would you let me know whether my rationale is flawed, and if so, why the sum of the probabilities can be above 1?

    Thank you in advance!

    opened by PastelBelem8 2
  • AttributeError: Can't pickle local object 'create_collate_fn.<locals>.collate_fn'

    When I tried to run the demo, I found this error! @dptam @jmohta @muqeeth

    Using bfloat16 Automatic Mixed Precision (AMP)
    GPU available: False, used: False
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    HPU available: False, using: 0 HPUs
    WARNING:datasets.builder:Reusing dataset super_glue (/Users/caffrey/Documents/research/t-few-master/cache/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)
    Missing logger folder: exp_out/first_exp/log
    WARNING:datasets.builder:Reusing dataset super_glue (/Users/caffrey/Documents/research/t-few-master/cache/super_glue/rte/1.0.2/d040c658e2ddef6934fdd97deb45c777b6ff50c524781ea434e7219b56a428a7)
    Train size 32
    Eval size 277
    
      | Name  | Type                       | Params
    -----------------------------------------------------
    0 | model | T5ForConditionalGeneration | 2.8 B 
    -----------------------------------------------------
    2.8 B     Trainable params
    0         Non-trainable params
    2.8 B     Total params
    11,399.029 Total estimated model params size (MB)
    Sanity Checking: 0it [00:00, ?it/s]Traceback (most recent call last):
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/Users/caffrey/Documents/paper/t-few-master/src/pl_train.py", line 86, in <module>
        main(config)
      File "/Users/caffrey/Documents/paper/t-few-master/src/pl_train.py", line 57, in main
        trainer.fit(model, datamodule)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
        self._call_and_handle_interrupt(
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
        results = self._run_stage()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
        return self._run_train()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1345, in _run_train
        self._run_sanity_check()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1413, in _run_sanity_check
        val_loop.run()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
        dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 199, in run
        self.on_run_start(*args, **kwargs)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 88, in on_run_start
        self._data_fetcher = iter(data_fetcher)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/pytorch_lightning/utilities/fetching.py", line 178, in __iter__
        self.dataloader_iter = iter(self.dataloader)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 443, in __iter__
        return self._get_iterator()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 389, in _get_iterator
        return _MultiProcessingDataLoaderIter(self)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1062, in __init__
        w.start()
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/process.py", line 121, in start
        self._popen = self._Popen(self)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/context.py", line 224, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
        return Popen(process_obj)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
        super().__init__(process_obj)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
        self._launch(process_obj)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
        reduction.dump(process_obj, fp)
      File "/Users/caffrey/miniforge3/envs/tongji/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    AttributeError: Can't pickle local object 'create_collate_fn.<locals>.collate_fn'
    
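    A common workaround sketch for this class of error (an assumption, not the repo's actual fix): macOS spawns DataLoader workers, which requires pickling the collate function, and a closure like create_collate_fn.<locals>.collate_fn cannot be pickled. Either avoid worker processes or use a top-level callable:

    from torch.utils.data import DataLoader

    dataset = list(range(32))  # stand-in for the real few-shot dataset

    # Option 1: num_workers=0 keeps everything in-process, so nothing is pickled.
    loader = DataLoader(dataset, batch_size=8, num_workers=0)

    # Option 2: a module-level callable class is picklable, and __init__ can
    # hold whatever state the original closure captured (tokenizer, etc.).
    class CollateFn:
        def __init__(self, pad_token_id=0):
            self.pad_token_id = pad_token_id

        def __call__(self, batch):
            return batch  # placeholder for the real collation logic

    loader = DataLoader(dataset, batch_size=8, num_workers=2, collate_fn=CollateFn())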
    
    opened by CaffreyR 2
  • Where are performance results of experiments stored

    Hi,

    thank you very much for sharing your code!

    I ran the example from the README and parts of the few-shot-pretrained-3b-100k.sh script. However, the dev_scores.json for the README example only contains the line:

    {"accuracy": 0.6101083032490975, "score_gt": 0.3983679488032303, "score_cand": 0.6958685107394676}

    And for t03b_copa_seed42_ia3_pretrained100k (the first experiment of few-shot-pretrained-3b-100k.sh):

    {"accuracy": 0.85, "score_gt": 0.06061243396921782, "score_cand": 0.4640417302213609}

    Those are just the results of the "Validation sanity check" right at the beginning, so I am wondering where the validation results after each epoch are stored. Or am I missing something here?

    Thanks!

    opened by stefanhgm 2
  • Clarification about IA^3

    Hi :)

    I was reading your interesting paper https://arxiv.org/pdf/2205.05638.pdf.

    In Section 3.3, you specify that IA^3 adds a total of d_k + d_v + d_ff parameters.
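    For reference, a minimal sketch of that formulation (the dimensions are illustrative; the ones-initialization is as described in the paper):

    import torch

    d_k, d_v, d_ff, seq_len = 64, 64, 256, 10
    l_k = torch.nn.Parameter(torch.ones(d_k))    # d_k new parameters
    l_v = torch.nn.Parameter(torch.ones(d_v))    # d_v new parameters
    l_ff = torch.nn.Parameter(torch.ones(d_ff))  # d_ff new parameters

    K = torch.randn(seq_len, d_k)
    V = torch.randn(seq_len, d_v)
    h = torch.relu(torch.randn(seq_len, d_ff))

    # (IA)^3 rescales activations element-wise, so the totals above are all
    # that is added per attention/FF block.
    K_scaled, V_scaled, h_scaled = l_k * K, l_v * V, l_ff * h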

    However, if I look at this line, you seem to be allocating 2 * d vectors for each linear layer (multi_lora_a, multi_lora_b) and multiplying multi_lora_a with the input and multi_lora_b with the transformed input.

    https://github.com/r-three/t-few/blob/9dbc9cc429888a0c27fc22188b4e9549e0e83f40/src/models/lora.py#L43

    Am I missing something?

    Thank you for your clarification :-)

    opened by sordonia 2
  • AttributeError: 'DistributedDataParallel' object has no attribute 'save_checkpoint'

    @HaokunLiu @dptam Thank you for your great work and congrats on the neurips acceptance!

    I have got an issue when using ddp as follows: AttributeError: 'DistributedDataParallel' object has no attribute 'save_checkpoint'

    It's raised by the following line: https://github.com/r-three/t-few/blob/4e581fa0b8f53e36da252a15bd581d365d4dd333/src/models/EncoderDecoder.py#L305

    Any suggestion would be appreciated!

    Another related question: why does the DDP checkpoint also need to be processed by zero_to_fp32.get_fp32_state_dict_from_zero_checkpoint(distributed_save_path)? I thought that applied only to DeepSpeed ZeRO checkpoints. This is done in: https://github.com/r-three/t-few/blob/4e581fa0b8f53e36da252a15bd581d365d4dd333/src/models/EncoderDecoder.py#L308

    opened by fwangut 1
  • Does pl_train.py support TPU training?

    Hello, I am interested in using the T-Few recipe for some experiments with Google Cloud TPUs. I am wondering whether the pl_train.py script already supports TPUs? The Acknowledgments section of the paper mentions that Cloud TPUs were used; however, in this script I can only see direct support for GPUs. Any pointers would be appreciated; in particular, I would like to use T0-11B.

    opened by kshre 1
  • save dev_pred.txt and test_pred.txt for RTE and ANLI

    Congrats on your great work! I am interested in analyzing the predictions of T0-3B + IA3 on NLI tasks. I ran the command python -m src.pl_train -c t03b.json+anli-r3.json+ia3.json -k exp_name=anli-r3 load_weight="pretrained_checkpoints/t03b_ia3_finish.pt" eval_epoch_interval=20 but only see the dev_scores.json file in the output. How can I also obtain the model's prediction files? Thanks!

    opened by ruixiangcui 2
  • What do multi_lora_a and multi_lora_b mean in the code?

    Hi, may I ask what multi_lora_a means? Is there any paper that explains it? Many thanks!

    https://github.com/r-three/t-few/blob/43fdb511f3278ac93d37fed4f5a7cbd965408e99/src/models/lora.py#L22

    opened by CaffreyR 3
  • Accuracy could not match with the log when load_model

    Hi @muqeeth @dptam @craffel, when I set eval_epoch_interval=1, I get accuracy values in my log, and I save my model and checkpoint. But when I reload the model, its accuracy does not match the logged accuracy.

    opened by CaffreyR 10
  • ImportError: cannot import name 'fast_walsh_hadamard_transform' from 'src.models.fwh_cuda' (unknown location)

    I tried running the example from the README and got this error. Can you help?

    $ CUDA_VISIBLE_DEVICES=3 python -m src.pl_train -c t0.json+rte.json -k save_model=False exp_name=first_exp
    Traceback (most recent call last):
      File "/home/james/.conda/envs/tfew/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/james/.conda/envs/tfew/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/james/github/t-few/src/pl_train.py", line 10, in <module>
        from src.models.EncoderDecoder import EncoderDecoder
      File "/home/james/github/t-few/src/models/EncoderDecoder.py", line 11, in <module>
        from .intrinsic import intrinsic_plugin_on_step
      File "/home/james/github/t-few/src/models/intrinsic.py", line 10, in <module>
        from .fwh_cuda import fast_walsh_hadamard_transform as fast_walsh_hadamard_transform_cuda
    ImportError: cannot import name 'fast_walsh_hadamard_transform' from 'src.models.fwh_cuda' (unknown location)
    
    opened by CrazyPython 5
  • Releasing evaluation log probabilities

    Hi, thanks for open-sourcing the model code! Could you release the log probabilities for the evaluation tasks (i.e., the model probabilities for the valid answers for each prompt on each question, for all evaluated datasets)? This data would allow for fine-grained evaluation of models and comparison against other LLMs.

    cf. https://github.com/facebookresearch/metaseq/issues/25

    opened by tholiao 1
Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Region Proportion Regularized Inference (RePRI) for Few-Shot Segmentation In this repo, we provide the code for our paper : "Few-Shot Segmentation Wit

Malik Boudiaf 138 Dec 12, 2022
Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification.

Easy Few-Shot Learning Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification. This repository is made for you

Sicara 399 Jan 8, 2023
All the essential resources and template code needed to understand and practice data structures and algorithms in python with few small projects to demonstrate their practical application.

Data Structures and Algorithms Python INDEX 1. Resources - Books Data Structures - Reema Thareja competitiveCoding Big-O Cheat Sheet DAA Syllabus Inte

Shushrut Kumar 129 Dec 15, 2022
SparseML is a libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

SparseML is a toolkit that includes APIs, CLIs, scripts and libraries that apply state-of-the-art sparsification algorithms such as pruning and quantization to any neural network. General, recipe-driven approaches built around these algorithms enable the simplification of creating faster and smaller models for the ML performance community at large.

Neural Magic 1.5k Dec 30, 2022
Code and data of the ACL 2021 paper: Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision

MetaAdaptRank This repository provides the implementation of meta-learning to reweight synthetic weak supervision data described in the paper Few-Shot

THUNLP 5 Jun 16, 2022
Code for 'Self-Guided and Cross-Guided Learning for Few-shot segmentation. (CVPR' 2021)'

SCL Introduction Code for 'Self-Guided and Cross-Guided Learning for Few-shot segmentation. (CVPR' 2021)' We evaluated our approach using two baseline

null 34 Oct 8, 2022
This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

H3DS Dataset This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction Access

Crisalix 72 Dec 10, 2022
Official code for "Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021".

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021. Introduction We proposed a novel model training paradi

Lucas 103 Dec 14, 2022
Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021

Moustafa Meshry 16 Oct 5, 2022
This repository is the code of the paper "Sparse Spatial Transformers for Few-Shot Learning".

Sparse Spatial Transformers for Few-Shot Learning This code implements the Sparse Spatial Transformers for Few-Shot Learning(SSFormers). Our code i

chx_nju 38 Dec 13, 2022
Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code

Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code.

Yasunori Shimura 7 Jul 27, 2022
The code is for the paper "A Self-Distillation Embedded Supervised Affinity Attention Model for Few-Shot Segmentation"

SD-AANet The code is for the paper "A Self-Distillation Embedded Supervised Affinity Attention Model for Few-Shot Segmentation" [arxiv] Overview confi

cv516Buaa 9 Nov 7, 2022
The Pytorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).

DeepBDC for few-shot learning        Introduction In this repo, we provide the implementation of the following paper: "Joint Distribution Matters: Dee

FeiLong 116 Dec 19, 2022
This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

Gautam Singh 66 Dec 26, 2022
CharacterGAN: Few-Shot Keypoint Character Animation and Reposing

CharacterGAN Implementation of the paper "CharacterGAN: Few-Shot Keypoint Character Animation and Reposing" by Tobias Hinz, Matthew Fisher, Oliver Wan

Tobias Hinz 181 Dec 27, 2022
Face2webtoon - Despite its importance, there are few previous works applying I2I translation to webtoon.

Despite its importance, there are few previous works applying I2I translation to webtoon. I collected dataset from naver webtoon 연애혁명 and tried to transfer human faces to webtoon domain.

이상윤 64 Oct 19, 2022
Few-shot Learning of GPT-3

Few-shot Learning With Language Models This is a codebase to perform few-shot "in-context" learning using language models similar to the GPT-3 paper.

Tony Z. Zhao 224 Dec 28, 2022
git《FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding》(CVPR 2021) GitHub: [fig8]

FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding (CVPR 2021) This repo contains the implementation of our state-of-the-art fewshot ob

null 233 Dec 29, 2022
Library of various Few-Shot Learning frameworks for text classification

FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt

Thomas Dopierre 47 Jan 3, 2023