Pointer-generator - Code for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Networks

Overview

Note: this code is no longer actively maintained. However, feel free to use the Issues section to discuss the code with other users. Some users have updated this code for newer versions of Tensorflow and Python - see information below and Issues section.


This repository contains code for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Networks. For an intuitive overview of the paper, read the blog post.

Looking for test set output?

The test set output of the models described in the paper can be found here.

Looking for pretrained model?

A pretrained model is available in two versions, one for Tensorflow 1.0 and one for Tensorflow 1.2.1.

(The only difference between these two is the naming of some of the variables in the checkpoint. Tensorflow 1.0 uses lstm_cell/biases and lstm_cell/weights, whereas Tensorflow 1.2.1 uses lstm_cell/bias and lstm_cell/kernel.)
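
The renaming described above can be written as a simple name mapping. The following is a minimal sketch only: the helper name and the example scope prefix are hypothetical, and it rewrites names as plain strings rather than converting a real checkpoint.

```python
# Hypothetical helper (not part of this repository): maps a Tensorflow 1.0
# style LSTM variable name to the Tensorflow 1.2.1 style name, following the
# two renames noted above.
RENAMES = {
    "lstm_cell/biases": "lstm_cell/bias",
    "lstm_cell/weights": "lstm_cell/kernel",
}


def to_tf_1_2_1_name(name):
    """Return the name with a TF 1.0 LSTM suffix swapped for the 1.2.1 one."""
    for old, new in RENAMES.items():
        if name.endswith(old):
            return name[: -len(old)] + new
    return name  # every other variable name is left unchanged


if __name__ == "__main__":
    # The scope prefix below is made up for illustration.
    print(to_tf_1_2_1_name("some_scope/lstm_cell/weights"))
```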

Note: This pretrained model is not the exact same model that is reported in the paper. That is, it is the same architecture, trained with the same settings, but resulting from a different training run. Consequently this pretrained model has slightly lower ROUGE scores than those reported in the paper. This is probably due to us slightly overfitting to the randomness in our original experiments (in the original experiments we tried various hyperparameter settings and selected the model that performed best). Repeating the experiment once with the same settings did not perform quite as well. Better results might be obtained from further hyperparameter tuning.

Why can't you release the trained model reported in the paper? Due to changes to the code between the original experiments and the time of releasing the code (e.g. TensorFlow version changes, lots of code cleanup), it is not possible to release the original trained model files.

Looking for CNN / Daily Mail data?

Instructions are here.

About this code

This code is based on the TextSum code from Google Brain.

This code was developed for Tensorflow 0.12, but has been updated to run with Tensorflow 1.0. In particular, the code in attention_decoder.py is based on tf.contrib.legacy_seq2seq.attention_decoder, which is now outdated. Tensorflow 1.0's new seq2seq library probably provides a way to do this (as well as beam search) more elegantly and efficiently.

Python 3 version: This code is in Python 2. If you want a Python 3 version, see @becxer's fork.

How to run

Get the dataset

To obtain the CNN / Daily Mail dataset, follow the instructions here. Once finished, you should have chunked datafiles train_000.bin, ..., train_287.bin, val_000.bin, ..., val_013.bin, test_000.bin, ..., test_011.bin (each contains 1000 examples) and a vocabulary file vocab.
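
To check that the chunking step produced every file listed above, something like the following sketch could be used. The function names are hypothetical. The index ranges come from the file names listed above.

```python
def expected_chunks(prefix, last_index):
    """Names like train_000.bin ... train_287.bin for one dataset split."""
    return ["%s_%03d.bin" % (prefix, i) for i in range(last_index + 1)]


def missing_chunks(present_names):
    """Return the expected chunk files that are absent from present_names."""
    expected = (
        expected_chunks("train", 287)
        + expected_chunks("val", 13)
        + expected_chunks("test", 11)
    )
    present = set(present_names)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # With an empty listing, every expected chunk is reported as missing.
    print(len(missing_chunks([])))
```

In practice you would pass in a directory listing, for example missing_chunks(os.listdir("/path/to/chunked")), and also confirm that the vocab file exists.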

Note: If you did this before 7th May 2017, follow the instructions here to correct a bug in the process.

Run training

To train your model, run:

python run_summarization.py --mode=train --data_path=/path/to/chunked/train_* --vocab_path=/path/to/vocab --log_root=/path/to/a/log/directory --exp_name=myexperiment

This will create a subdirectory of your specified log_root called myexperiment where all checkpoints and other data will be saved. Then the model will start training using the train_*.bin files as training data.

Warning: Using default settings as in the above command, both initializing the model and running training iterations will probably be quite slow. To make things faster, try setting the following flags (especially max_enc_steps and max_dec_steps) to something smaller than the defaults specified in run_summarization.py: hidden_dim, emb_dim, batch_size, max_enc_steps, max_dec_steps, vocab_size.

Increasing sequence length during training: Note that to obtain the results described in the paper, we increase the values of max_enc_steps and max_dec_steps in stages throughout training (mostly so we can perform quicker iterations during early stages of training). If you wish to do the same, start with small values of max_enc_steps and max_dec_steps, then interrupt and restart the job with larger values when you want to increase them.
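
As a rough illustration of the staged approach, the sketch below builds one train command per stage. The stage values are illustrative assumptions, not the schedule used for the paper, and the helper name is hypothetical. Each command would be run until you choose to interrupt it, then the next one started with the same log_root and exp_name.

```python
def train_command(max_enc_steps, max_dec_steps):
    """Build the train command from above with explicit sequence-length flags."""
    return (
        "python run_summarization.py --mode=train "
        "--data_path=/path/to/chunked/train_* --vocab_path=/path/to/vocab "
        "--log_root=/path/to/a/log/directory --exp_name=myexperiment "
        "--max_enc_steps=%d --max_dec_steps=%d" % (max_enc_steps, max_dec_steps)
    )


# Illustrative (max_enc_steps, max_dec_steps) pairs, smallest first.
# These numbers are assumptions chosen for the example.
STAGES = [(100, 25), (200, 50), (400, 100)]

if __name__ == "__main__":
    for enc_steps, dec_steps in STAGES:
        print(train_command(enc_steps, dec_steps))
```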

Run (concurrent) eval

You may want to run a concurrent evaluation job that runs your model on the validation set and logs the loss. To do this, run:

python run_summarization.py --mode=eval --data_path=/path/to/chunked/val_* --vocab_path=/path/to/vocab --log_root=/path/to/a/log/directory --exp_name=myexperiment

Note: run the above command using the same settings you entered for your training job.

Restoring snapshots: The eval job saves a snapshot of the model that scored the lowest loss on the validation data so far. You may want to restore one of these "best models", e.g. if your training job has overfit, or if the training checkpoint has become corrupted by NaN values. To do this, run your train command plus the --restore_best_model=1 flag. This will copy the best model in the eval directory to the train directory. Then run the usual train command again.

Run beam search decoding

To run beam search decoding:

python run_summarization.py --mode=decode --data_path=/path/to/chunked/val_* --vocab_path=/path/to/vocab --log_root=/path/to/a/log/directory --exp_name=myexperiment

Note: run the above command using the same settings you entered for your training job (plus any decode-mode-specific flags like beam_size).

This will repeatedly load random examples from your specified datafile and generate a summary using beam search. The results will be printed to screen.

Visualize your output: Additionally, the decode job produces a file called attn_vis_data.json. This file provides the data necessary for an in-browser visualization tool that allows you to view the attention distributions projected onto the text. To use the visualizer, follow the instructions here.

If you want to run evaluation on the entire validation or test set and get ROUGE scores, set the flag single_pass=1. This will go through the entire dataset in order, writing the generated summaries to file, and then run evaluation using pyrouge. (Note this will not produce the attn_vis_data.json files for the attention visualizer).

Evaluate with ROUGE

decode.py uses the Python package pyrouge to run ROUGE evaluation. pyrouge provides an easier-to-use interface for the official Perl ROUGE package, which you must install for pyrouge to work. Here are some useful instructions on how to do this:

Note: As of 18th May 2017 the website for the official Perl package appears to be down. Unfortunately you need to download a directory called ROUGE-1.5.5 from there. As an alternative, it seems that you can get that directory from here (however, the version of pyrouge in that repo appears to be outdated, so best to install pyrouge from the official source).
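
If the Perl package cannot be installed, a crude unigram-overlap score can still give a rough sanity check of decoded summaries. The sketch below is not ROUGE and is not what decode.py or pyrouge computes: it has no stemming, no ROUGE-2 or ROUGE-L, and no confidence intervals, so its numbers are not comparable to published results. The function name is hypothetical.

```python
from collections import Counter


def unigram_f1(reference, decoded):
    """F1 of whitespace-token overlap between a reference and a decoded summary."""
    ref_counts = Counter(reference.split())
    dec_counts = Counter(decoded.split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & dec_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / float(sum(dec_counts.values()))
    recall = overlap / float(sum(ref_counts.values()))
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(unigram_f1("the cat sat on the mat", "the cat sat"))
```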

Tensorboard

Run Tensorboard from the experiment directory (in the example above, myexperiment). You should be able to see data from the train and eval runs. If you select "embeddings", you should also see your word embeddings visualized.

Help, I've got NaNs!

For reasons that are difficult to diagnose, NaNs sometimes occur during training, making the loss=NaN and sometimes also corrupting the model checkpoint with NaN values, making it unusable. Here are some suggestions:

  • If training stopped with the "Loss is not finite. Stopping." exception, you can just try restarting. It may be that the checkpoint is not corrupted.
  • You can check if your checkpoint is corrupted by using the inspect_checkpoint.py script. If it says that all values are finite, then your checkpoint is OK and you can try resuming training with it.
  • The training job is set to keep 3 checkpoints at any one time (see the max_to_keep variable in run_summarization.py). If your newer checkpoint is corrupted, it may be that one of the older ones is not. You can switch to that checkpoint by editing the checkpoint file inside the train directory.
  • Alternatively, you can restore a "best model" from the eval directory. See the note Restoring snapshots above.
  • If you want to try to diagnose the cause of the NaNs, you can run with the --debug=1 flag turned on. This will run Tensorflow Debugger, which checks for NaNs and diagnoses their causes during training.
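
Switching to an older checkpoint by editing the checkpoint file, as suggested above, amounts to changing its model_checkpoint_path line. The sketch below assumes the usual Tensorflow 1.x text layout of that file (one model_checkpoint_path line followed by all_model_checkpoint_paths lines). The function name and the checkpoint names in the example are hypothetical, and you should back up the file before overwriting it.

```python
def point_to_checkpoint(checkpoint_text, checkpoint_name):
    """Return the checkpoint file text with model_checkpoint_path switched.

    Raises ValueError if checkpoint_name is not one of the listed checkpoints.
    """
    lines = checkpoint_text.splitlines()
    quoted = '"%s"' % checkpoint_name
    listed = any(
        line.startswith("all_model_checkpoint_paths:") and quoted in line
        for line in lines
    )
    if not listed:
        raise ValueError("%s is not listed in the checkpoint file" % checkpoint_name)
    rewritten = []
    for line in lines:
        if line.startswith("model_checkpoint_path:"):
            line = "model_checkpoint_path: %s" % quoted
        rewritten.append(line)
    return "\n".join(rewritten) + "\n"


if __name__ == "__main__":
    # Made-up file contents for illustration only.
    sample = (
        'model_checkpoint_path: "model.ckpt-300"\n'
        'all_model_checkpoint_paths: "model.ckpt-200"\n'
        'all_model_checkpoint_paths: "model.ckpt-300"\n'
    )
    print(point_to_checkpoint(sample, "model.ckpt-200"))
```
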
Comments
  • Failure to replicate results


    Has anyone been able to successfully replicate the model from the paper? I've been training for about two weeks (over 240k iterations) using the published parameters, along with training an additional ~3k iterations with coverage. Here is what my training loss looks like:

    [Image: training loss curve]

    Unclear what caused the increase starting around iteration 180k, but even then the output was not looking great.

    Here are some REF (gold) and DEC (system) summaries. As you can see, they are qualitatively bad. Unfortunately, at the moment, I can't figure out how to get pyrouge to run so I can't quantify the performance relative to the published results.

    000000_reference.txt

    REF: marseille prosecutor says so far no videos were used in the crash investigation '' despite media reports . journalists at bild and paris match are very confident '' the video clip is real , an editor says . andreas lubitz had informed his lufthansa training school of an episode of severe depression , airline says .

    DEC: robin 's comments are aware of any video footage , german paris match . he 's accused into the crash of germanwings flight 9525 flight . prosecutor : `` it is a very disturbing scene ''

    000001_reference.txt

    REF: membership gives the icc jurisdiction over alleged crimes committed in palestinian territories since last june . israel and the united states opposed the move , which could open the door to war crimes investigations against israelis .

    DEC: palestinians signed icc 's founding rome statute of alleged crimes in palestinian territories . israel says `` in the occupied palestinian territory to immediately end and injustice , she says . it 's founding rome .

    000002_reference.txt

    REF: amnesty 's annual death penalty report catalogs encouraging signs , but setbacks in numbers of those sentenced to death . organization claims that governments around the world are using the threat of terrorism to advance executions . the number of executions worldwide has gone down by almost 22 % compared with 2013 , but death sentences up by 28 % .

    DEC: it 's death sentences and executions 2014 '' is some we are closer to abolition , to advance executions '' number of deterrence , `` a number of countries are abolitionist '' amnesty says he would not be used for the death penalty .

    000003_reference.txt

    REF: amnesty international releases its annual review of the death penalty worldwide ; much of it makes for grim reading . salil shetty : countries that use executions to deal with problems are on the wrong side of history .

    DEC: soldiers who a china agreed to tackle a surge in death sentences to death . jordan ended china 's public mass sentencing is part a china 's northwestern xinjiang region . a sharp spike in december 2006 , 2014 , 2014 .

    000004_reference.txt

    REF: museum : anne frank died earlier than previously believed . researchers re-examined archives and testimonies of survivors . anne and older sister margot frank are believed to have died in february 1945 .

    DEC: bergen-belsen concentration camp is believed death on march 31 , anne frank says . four the jewish diarist concentration camp , margot , margot , violent , died at the age of 15 . `` i am no more than a skeleton camp , '' witnesses say .

    If anyone has had success reproducing the published model I would love to hear how you did it. I'm stumped.

    question 
    opened by hate5six 28
  • Can you share the Model file for testing on my side?


    Dear repository maker, I would like to test the results on my side. Will it be possible for you to share the trained model of tensorflow on which you have tested the results? I have checked the output and found the results interesting. Kindly, let me know whether you can share the trained model, so that if I need to train the model I will start from the last stop, i.e. from your model. Do let me know.

    question 
    opened by JafferWilson 14
  • Training error: "TensorArray has size zero, but element shape is not fully defined."

    Hello,

    After running the code for about half an hour in train mode, the following error stopped the training process:

    TensorArray has size zero, but element shape is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.

    The error appeared when the code was run with the following parameters:

    CUDA_VISIBLE_DEVICES=5 python run_summarization.py --mode=train --data_path=/path-to-train/train.bin --vocab_path=/path-to-vocab/vocab --log_root=/path-to-log/log --exp_name=myexperiment2 --max_enc_steps=100 --max_dec_steps=50 --lr=0.01

    The error also occurred after some hours when the learning rate was not specified.

    The full traceback is provided below:

    Caused by op u'gradients/seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGatherV3', defined at:
      File "run_summarization.py", line 264, in <module>
        tf.app.run()
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "run_summarization.py", line 250, in main
        setup_training(model, batcher)
      File "run_summarization.py", line 111, in setup_training
        model.build_graph() # build the graph
      File "/local/s1742159/neuralnetworks2017/pointer-generator/model.py", line 298, in build_graph
        self._add_train_op()
      File "/local/s1742159/neuralnetworks2017/pointer-generator/model.py", line 274, in _add_train_op
        gradients = tf.gradients(loss_to_minimize, tvars, aggregation_method=tf.AggregationMethod.EXPERIMENTAL_TREE)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 560, in gradients
        grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 368, in _MaybeCompile
        return grad_fn() # Exit early
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 560, in <lambda>
        grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/tensor_array_grad.py", line 186, in _TensorArrayScatterGrad
        grad = g.gather(indices)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 328, in gather
        element_shape=element_shape)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2244, in _tensor_array_gather_v3
        element_shape=element_shape, name=name)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
        op_def=op_def)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
        self._traceback = _extract_stack()

    ...which was originally created as op u'seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3', defined at:
      File "run_summarization.py", line 264, in <module>
        tf.app.run()
    [elided 2 identical lines from previous traceback]
      File "run_summarization.py", line 111, in setup_training
        model.build_graph() # build the graph
      File "/local/s1742159/neuralnetworks2017/pointer-generator/model.py", line 295, in build_graph
        self._add_seq2seq()
      File "/local/s1742159/neuralnetworks2017/pointer-generator/model.py", line 199, in _add_seq2seq
        enc_outputs, fw_st, bw_st = self._add_encoder(emb_enc_inputs, self._enc_lens)
      File "/local/s1742159/neuralnetworks2017/pointer-generator/model.py", line 74, in _add_encoder
        (encoder_outputs, (fw_st, bw_st)) = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, encoder_inputs, dtype=tf.float32, sequence_length=seq_len, swap_memory=True)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 350, in bidirectional_dynamic_rnn
        time_major=time_major, scope=fw_scope)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 553, in dynamic_rnn
        dtype=dtype)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 671, in dynamic_rnn_loop
        for ta, input in zip(input_ta, flat_input))
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 671, in <genexpr>
        for ta, input in zip(input_ta, flat_input))
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 381, in unstack
        indices=math_ops.range(0, num_elements), value=value, name=name)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 409, in scatter
        name=name)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2510, in _tensor_array_scatter_v3
        name=name)
      File "/opt/neuralnetworks/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
        op_def=op_def)

    UnimplementedError (see above for traceback): TensorArray has size zero, but element shape is not fully defined. Currently only static shapes are supported when packing zero-size TensorArrays.
    [[Node: gradients/seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGatherV3 = TensorArrayGatherV3[_class=["loc:@seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArray_1"], dtype=DT_FLOAT, element_shape=, _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/TensorArrayGradV3, seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/range, gradients/seq2seq/encoder/bidirectional_rnn/fw/fw/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3_grad/TensorArrayGrad/gradient_flow)]]
    [[Node: seq2seq/output_projection/Softmax_24/_3175 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_5326_seq2seq/output_projection/Softmax_24", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

    bug 
    opened by RafaelMostert 11
  • Efficiently generate summaries from new data?


    How do folks typically generate summaries given i) new text files (only content, no headlines/abstracts/urls) and ii) the pretrained model?

    It seems that make_datafiles.py from the cnn-dailymail dir needs to be modified (e.g. removing much of the hardcoding). After these modifications, make_datafiles.py may be used to tokenize and chunk the new data. From there, we "decode" using the pretrained model to generate new summaries.

    Is the above generally correct or is there a more efficient method?

    opened by ibarrien 9
  • Mode must be train/eval/decode


    python run_summarization.py --mode=train --data_path=../misc/finished_files/train.bin --vocab_path=../misc/finished_files/vocab --log_root=./log --exp_name=first_test

    My command is "--mode=train".

    I am getting an error as

    tf.app.run()
      File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
        _sys.exit(main(argv))
      File "run_summarization.py", line 301, in main
        raise ValueError("The 'mode' flag must be one of train/eval/decode")
    ValueError: The 'mode' flag must be one of train/eval/decode

    opened by Adityagrao 8
  • [Potential] error in computing attention and final word probability


    Hi,

    In the attention_decoder.py, you compute attention scores over the encoder states from L92 - L103.

    I can't seem to find where you've masked the encoder features of size - (batch_size,attn_length, attention_vec_size). Without masking, the computed attention score will be incorrect. It will assign values to encoder states that correspond to PAD tokens upon taking softmax.

    Also, the word probability over the extended vocabulary is computed using tf.scatter_nd. This does not work in the case of duplicate indices, as the output will be non-deterministic and not summed up. So, if the source text had more than one occurrence of an OOV word, it will become a problem.

    bug 
    opened by hashbangCoder 7
  • Problem of the word embedding of OOV words of decoder inputs


    Hi, I am having some difficulty understanding your code, in particular the word embedding of the decoder input. The emb_enc_inputs uses the source input with all the OOV words replaced by UNK tokens, and that makes sense to me. But the emb_dec_inputs uses the target input in which each OOV word has a temporary id. Why do you use that for embedding_lookup? Please correct me if I have made any mistake. Thanks in advance!

    opened by ghost 6
  • Model producing largely extractive summaries


    After much work I was finally able to get the model to train successfully (220k iterations, 400/100 max enc/dec steps, 3k iterations with coverage bringing the coverage loss to 0.2).

    I'm noticing when running the model (with beam size = 4) on my own data (financial news) the decoded summaries are almost entirely extractive. What could be causing this? I've left the vocab size at the default size of 50k. Is it possible that this is a result of having too small of a vocab?

    Example (bold added to source to show extracted regions):

    Decoded

    air line pilots association said friday that 79 % of voting aviators approved a deal running through january 2019 that the union said provided industry-leading pay and benefits .

    investors are closely watching the outcome of contract talks involving other united labor groups and at other carriers , concerned about a repeat of previous industry cycles when market conditions deteriorated .

    pilots at delta air lines inc and southwest airlines co. both rejected proposed deals last year . flight attendants have been unable to reach agreement on a joint contract even though they have been bargaining since 2012 .

    Source

    Pilots at United Continental Holdings Inc. overwhelmingly approved a two-year contract extension, continuing the momentum of the airline's efforts to restore labor peace and complete the integration of staff following its creation in a 2010 merger. The Air Line Pilots Association said Friday that 79% of voting aviators approved a deal running through January 2019 that the union said provided industry-leading pay and benefits. Investors are closely watching the outcome of contract talks involving other United labor groups and at other carriers, concerned about a repeat of previous industry cycles when record profits resulted in more generous deals that hobbled carriers' finances when market conditions deteriorated. Pilots at Delta Air Lines Inc and Southwest Airlines Co. both rejected proposed deals last year. The United contract provides for higher pay, restores benefits for previously furloughed pilots, and enhances scheduling rules for long-haul flights, according to the pilot union. Both sides declined to provide contract details, though a person familiar with the situation said it included a 13% pay rise this year followed by a 3% increase in 2017 and 2% in 2018. United executives said this week that the pilot contract and a new deal being considered by its technicians would raise its unit costs excluding fuel by 2.5 percentage points this year compared with 2015. The airline forecast its unit costs this year excluding fuel and the two labor deals would rise by between 0.5% and 1.5%. The airline has suffered from rocky labor relations since its 2010 merger with Continental Airlines and has been trying to smooth that friction under new management led by Chief Executive Oscar Munoz that took over in September. United has fresh deals or tentative agreements with the majority of its unionized staff, with the results on a new contract for its mechanics due to be revealed next week, but the toughest challenge remains securing a pact with flight attendants. 
Flight attendants have been unable to reach agreement on a joint contract even though they have been bargaining since 2012. United presented fresh proposals covering pay and work practices last week in talks brokered by federal mediators. Flight attendants this week held a world-wide protest in pursuit of a joint contract. The airline in November reached a deal to start negotiations with the International Association of Machinists union more than a year before the contract covering 30,000 ramp workers, customer-service agents and reservation staff opens for renewal at the end of 2016. It also reached a new proposed joint labor agreement for its 9,000 mechanics, with the International Brotherhood of Teamsters union due to issue the results on Jan. 25. Pilots at Delta, its only union-represented labor group, resumed contract talks last month after rejecting a deal endorsed by union leaders. Southwest pilots turned down a proposed deal in November, with flight attendants having rejected a tentative new pact earlier in the year.

    question 
    opened by hate5six 6
  • Does anyone want to try adding this model to the Transformer?

    I am trying to add this model to the Transformer. If anyone wants to try it, please contact me so we can discuss it. My WeChat is 15025700935, and my e-mail is [email protected]

    opened by xiongma 5
  • Fail to decode any meaningful output.


    I have trained the model for about 80,000 iterations and the loss has decreased to about 0.000004, which is really low. But when I run the model in decode mode, the output is just what what what what... or why why why why.... I am using the Quora data set for training. There are about 150,000 pairs of duplicate questions. Have you had a similar experience in your training, and if so, how did you fix it? Thanks!

    question 
    opened by ghost 5
  • Training speed


    Hello, thank you for your work. With the default settings on a 1080 and TF 1.0, I'm getting about 13 secs per size 16 batch, which would mean 1 epoch takes about 3 days, which is clearly off. Do you have any ideas what may be causing the slowdown?

    question 
    opened by bugtig 5
  • Module Queue not found


    Traceback (most recent call last):
      File "run_summarization.py", line 11, in <module>
        from batcher import Batcher
      File "/content/drive/MyDrive/pointer-generator-master/batcher.py", line 4, in <module>
        import Queue
    ModuleNotFoundError: No module named 'Queue'

    I am getting this error while running decode mode

    opened by kanushi19 0
  • How to fine-tune pre-trained model on a smaller dataset?


    I was wondering how it is possible to fine-tune the pre-trained model on a smaller dataset? What about the implementation of coverage mechanism during the fine-tuning? Do you propose specific settings for hyperparameters (learning rate for example) and the number of iterations?

    opened by alsbhn 0
  • DuplicateFlagError: The flag 'data_path' is defined twice. First from run_summarization.py, Second from run_summarization.py.  Description from first occurrence: Path expression to tf.Example datafiles. Can include wildcards to access multiple datafiles.


    I am currently working on a project on abstractive text summarization with a pointer-generator network. I am trying to run the code from the GitHub repository linked below, using Python 3.7.3 and Tensorflow 1.14.0, but I get the error "DuplicateFlagError: The flag 'data_path' is defined twice. First from run_summarization.py, Second from run_summarization.py. Description from first occurrence: Path expression to tf.Example datafiles. Can include wildcards to access multiple datafiles".

    Can you please help me resolve this issue? It is very urgent. Please reply.


    opened by tohidarehman1988 0
  • Why does the decoder produce the same generated summary?

    The model reached a loss of 5, and it always generates the same summary for all articles:

    [UNK] [UNK] , 28 , has been charged with two counts of first-degree murder . he has been charged with two counts of attempted murder . he was sentenced to 15 years in prison and sentenced to 18 months in prison .

    opened by PH-github95 4
Owner
Abi See
Stanford PhD student in Natural Language Processing