# CRL_EGPG
PyTorch implementation of Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation.
We use the contrastive loss implementation by HobbitLong.
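For readers unfamiliar with that loss, below is a minimal sketch of a normalized-temperature (NT-Xent style) contrastive loss in PyTorch. The function name, tensor shapes, and temperature value are illustrative assumptions, not the exact code vendored from HobbitLong's repository.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Contrastive loss over two batches of paired representations.

    z_a, z_b: [batch, dim] embeddings whose i-th rows form a positive pair
    (e.g. representations of a sentence and its paraphrase); every other
    row in the combined batch acts as a negative.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    z = torch.cat([z_a, z_b], dim=0)                    # [2N, dim]
    sim = torch.mm(z, z.t()) / temperature              # [2N, 2N] scaled cosine similarities
    n = z_a.size(0)
    # Remove self-similarities so a sample is never its own positive or negative.
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
    # Row i's positive sits at index i + n (first half) or i - n (second half).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```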
## How to train
- Download the dataset from here and put it in the project directory.
  You can directly use the preprocessed datasets (`data/`: QQP-Pos, `data2/`: ParaNMT), or process the raw data (Quora and ParaNMT) on your own through `quora_process.py` and `para_process.py`, respectively.
  If you take the second route, you need to set the variable `text_path` in those two scripts (see the short sketch after this list).
- Train the model: `python train.py --datasets quora --model_save_path directory_to_save_model`
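If you preprocess the raw data yourself, the only change to the two scripts is pointing `text_path` at the raw corpus you downloaded. A minimal sketch, with a placeholder path rather than one shipped with the repository:

```python
# Near the top of quora_process.py (and likewise in para_process.py):
# replace the placeholder with wherever you stored the raw downloaded corpus.
text_path = "/path/to/raw/quora_corpus"
```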
## How to evaluate
- First, generate the test target sentences by running `python evaluate.py --model_save_path your_saved_model --idx which_model_you_want_to_test`.
  After running the command, you will find the generated target file `trg_genidx.txt` and the corresponding exemplar file `exmidx.txt` (the `idx` suffix corresponds to the `--idx` value you passed).
- Follow the repository provided by malllabiisc and set up the evaluation code. Then run
  `python -m src.evaluation.eval -i path/trg_genidx.txt -r path/test_trg.txt -t path/exmidx.txt`,
  replacing `path` with the corresponding paths to your files.
## How to generate multiple paraphrases for one input
You can modify `generate.py`, or just run `python generate.py`.
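If you prefer to adapt the script, one natural way to get several paraphrases for a single input in an exemplar-guided model is to pair the same source sentence with different exemplars. The sketch below only illustrates that idea; `encode_content`, `encode_style`, `decode`, and the tokenizer interface are hypothetical placeholders, not the actual functions in `generate.py`.

```python
from typing import List
import torch

def paraphrase_with_exemplars(model, tokenizer, source: str, exemplars: List[str]) -> List[str]:
    """Hypothetical sketch: reuse one source sentence with several exemplars.

    All method names on `model` and `tokenizer` are placeholders for whatever
    interfaces the repository actually exposes.
    """
    outputs = []
    with torch.no_grad():
        content = model.encode_content(tokenizer.encode(source))      # meaning of the input
        for exm in exemplars:
            style = model.encode_style(tokenizer.encode(exm))         # syntax/style of the exemplar
            outputs.append(tokenizer.decode(model.decode(content, style)))
    return outputs
```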