Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Overview

Introduction

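This is a PyTorch implementation of the following research papers:

(1) Deal or No Deal? End-to-End Learning for Negotiation Dialogues
(2) Hierarchical Text Generation and Planning for Strategic Dialogue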

The code is developed by Facebook AI Research.

The code trains neural networks to hold negotiations in natural language, and supports reinforcement learning self-play and rollout-based planning.

Citation

If you want to use this code in your research, please cite:

@inproceedings{DBLP:conf/icml/YaratsL18,
  author    = {Denis Yarats and
               Mike Lewis},
  title     = {Hierarchical Text Generation and Planning for Strategic Dialogue},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning,
               {ICML} 2018, Stockholmsm{\"{a}}ssan, Stockholm, Sweden, July
               10-15, 2018},
  pages     = {5587--5595},
  year      = {2018},
  crossref  = {DBLP:conf/icml/2018},
  url       = {http://proceedings.mlr.press/v80/yarats18a.html},
  timestamp = {Fri, 13 Jul 2018 14:58:25 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/icml/YaratsL18},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Dataset

We release our dataset together with the code; you can find it under data/negotiate. The dataset consists of 5808 dialogues based on 2236 unique scenarios. See §2.3 of the paper for details on data collection.

Each dialogue is converted into two training examples in the dataset, showing the complete conversation from the perspective of each agent. The perspectives differ in their input goals, their output choice, and the special tokens marking whether a statement was read or written. See §3.1 for details on the data representation.

# Perspective of Agent 1
<input> 1 4 4 1 1 2 </input>
<dialogue> THEM: i would like 4 hats and you can have the rest . <eos> YOU: deal <eos> THEM: <selection> </dialogue>
<output> item0=1 item1=0 item2=1 item0=0 item1=4 item2=0 </output> 
<partner_input> 1 0 4 2 1 2 </partner_input>

# Perspective of Agent 2
<input> 1 0 4 2 1 2 </input>
<dialogue> YOU: i would like 4 hats and you can have the rest . <eos> THEM: deal <eos> YOU: <selection> </dialogue>
<output> item0=0 item1=4 item2=0 item0=1 item1=0 item2=1 </output>
<partner_input> 1 4 4 1 1 2 </partner_input>
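
The relationship between the two perspectives can be checked with a short script. The following is a minimal sketch, not part of the released code: the helper names `parse_example` and `flip_perspective` are hypothetical, and it assumes that all four tagged fields of an example are available in one string. It parses the Agent 1 example above and derives the partner's view by swapping the goals, the YOU:/THEM: markers, and the two halves of the output.

```python
import re

TAGS = ("input", "dialogue", "output", "partner_input")


def parse_example(text):
    """Split one training example into {tag: list of tokens}."""
    fields = {}
    for tag in TAGS:
        match = re.search(r"<{0}>(.*?)</{0}>".format(tag), text, re.DOTALL)
        if match is None:
            raise ValueError("missing <%s> field" % tag)
        fields[tag] = match.group(1).split()
    return fields


def flip_perspective(fields):
    """Rewrite an example as the partner would see it (hypothetical helper)."""
    swap = {"YOU:": "THEM:", "THEM:": "YOU:"}
    half = len(fields["output"]) // 2
    return {
        # The partner's goals become the input, and vice versa.
        "input": fields["partner_input"],
        # Read/written markers are exchanged; all other tokens are kept.
        "dialogue": [swap.get(tok, tok) for tok in fields["dialogue"]],
        # The output lists "my items" first, so the two halves trade places.
        "output": fields["output"][half:] + fields["output"][:half],
        "partner_input": fields["input"],
    }


agent1 = parse_example(
    "<input> 1 4 4 1 1 2 </input> "
    "<dialogue> THEM: i would like 4 hats and you can have the rest . <eos> "
    "YOU: deal <eos> THEM: <selection> </dialogue> "
    "<output> item0=1 item1=0 item2=1 item0=0 item1=4 item2=0 </output> "
    "<partner_input> 1 0 4 2 1 2 </partner_input>"
)
agent2 = flip_perspective(agent1)
print(" ".join(agent2["dialogue"]))
```

For this example, the flipped fields match the Agent 2 listing shown above.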

Setup

All code was developed with Python 3.0 on CentOS Linux 7, and tested on Ubuntu 16.04. In addition, we used PyTorch 1.0.0, CUDA 9.0, and Visdom 0.1.8.4.

We recommend using Anaconda. To set up a working environment, follow the steps below:

# Create a conda environment (assumes Anaconda is already installed)
conda create -n py30 python=3 anaconda
# Activate environment
source activate py30
# Install PyTorch
conda install pytorch torchvision cuda90 -c pytorch
# Install Visdom if you want to use visualization
pip install visdom

Usage

Supervised Training

Action Classifier

We use an action classifier to compare performance of various models. The action classifier is described in section 3 of (2). It can be trained by running the following command:

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 7 \
--min_lr 1e-05 \
--model_type selection_model \
--momentum 0.1 \
--nembed_ctx 128 \
--nembed_word 128 \
--nhid_attn 128 \
--nhid_ctx 64 \
--nhid_lang 128 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--skip_values \
--sep_sel \
--model_file selection_model.th

Baseline RNN Model

This is the baseline RNN model that we describe in (1):

python train.py \
--cuda \
--bsz 16 \
--clip 0.5 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--model_type rnn_model \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 30 \
--min_lr 1e-07 \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_attn 64 \
--nhid_ctx 64 \
--nhid_lang 128 \
--nhid_sel 128 \
--sel_weight 0.6 \
--unk_threshold 20 \
--sep_sel \
--model_file rnn_model.th

Hierarchical Latent Model

In this section we provide guidelines on how to train the hierarchical latent model from (2). The final model requires two sub-models: the clustering model, which learns compact representations over intents; and the language model, which translates intent representations into language. Please read sections 5 and 6 of (2) for more details.

Clustering Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.2 \
--init_range 0.3 \
--lr 0.001 \
--max_epoch 15 \
--min_lr 1e-05 \
--model_type latent_clustering_model \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--skip_values \
--nhid_cluster 256 \
--selection_model_file selection_model.th \
--model_file clustering_model.th

Language Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.1 \
--init_range 0.2 \
--lr 0.001 \
--max_epoch 15 \
--min_lr 1e-05 \
--model_type latent_clustering_language_model \
--momentum 0.1 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--nhid_cluster 256 \
--skip_values \
--selection_model_file selection_model.th \
--cluster_model_file clustering_model.th \
--model_file clustering_language_model.th

Full Model

python train.py \
--cuda \
--bsz 16 \
--clip 2.0 \
--decay_every 1 \
--decay_rate 5.0 \
--domain object_division \
--dropout 0.2 \
--init_range 0.3 \
--lr 0.001 \
--max_epoch 10 \
--min_lr 1e-05 \
--model_type latent_clustering_prediction_model \
--momentum 0.2 \
--nembed_ctx 64 \
--nembed_word 256 \
--nhid_ctx 64 \
--nhid_lang 256 \
--nhid_sel 128 \
--nhid_strat 256 \
--unk_threshold 20 \
--num_clusters 50 \
--sep_sel \
--selection_model_file selection_model.th \
--lang_model_file clustering_language_model.th \
--model_file full_model.th

Selfplay

If you want two pretrained models to negotiate against each other, use selfplay.py. For example, let's have two RNN models play against each other:

python selfplay.py \
--cuda \
--alice_model_file rnn_model.th \
--bob_model_file rnn_model.th \
--context_file data/negotiate/selfplay.txt  \
--temperature 0.5 \
--selection_model_file selection_model.th

The script will output generated dialogues, as well as some statistics. For example:

================================================================================
Alice : book=(count:3 value:1) hat=(count:1 value:5) ball=(count:1 value:2)
Bob   : book=(count:3 value:1) hat=(count:1 value:1) ball=(count:1 value:6)
--------------------------------------------------------------------------------
Alice : i would like the hat and the ball . <eos>
Bob   : i need the ball and the hat <eos>
Alice : i can give you the ball and one book . <eos>
Bob   : i can't make a deal without the ball <eos>
Alice : okay then i will take the hat and the ball <eos>
Bob   : okay , that's fine . <eos>
Alice : <selection>
Alice : book=0 hat=1 ball=1 book=3 hat=0 ball=0
Bob   : book=3 hat=0 ball=0 book=0 hat=1 ball=1
--------------------------------------------------------------------------------
Agreement!
Alice : 7 points
Bob   : 3 points
--------------------------------------------------------------------------------
dialog_len=4.47 sent_len=6.93 agree=86.67% advantage=3.14 time=2.069s comb_rew=10.93 alice_rew=6.93 alice_sel=60.00% alice_unique=26 bob_rew=4.00 bob_sel=40.00% bob_unique=25 full_match=0.78 
--------------------------------------------------------------------------------
debug: 3 1 1 5 1 2 item0=0 item1=1 item2=1
debug: 3 1 1 1 1 6 item0=3 item1=0 item2=0
================================================================================
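
The point totals in this output can be reproduced by hand. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the six numbers on each debug line are (count, value) pairs per item, as the Alice/Bob header lines suggest. An agent's score is then the sum of value times number of items taken.

```python
def parse_context(context):
    """Split '3 1 1 5 1 2' into counts [3, 1, 1] and values [1, 5, 2]."""
    numbers = [int(tok) for tok in context.split()]
    return numbers[0::2], numbers[1::2]


def parse_selection(selection):
    """Turn 'item0=0 item1=1 item2=1' into [0, 1, 1]."""
    return [int(item.split("=")[1]) for item in selection.split()]


def score(context, selection):
    """Points an agent earns for the items it takes (hypothetical helper)."""
    counts, values = parse_context(context)
    taken = parse_selection(selection)
    if any(t > c for t, c in zip(taken, counts)):
        raise ValueError("selection exceeds the available item counts")
    return sum(v * t for v, t in zip(values, taken))


# Contexts and selections copied from the two debug lines above.
alice = score("3 1 1 5 1 2", "item0=0 item1=1 item2=1")
bob = score("3 1 1 1 1 6", "item0=3 item1=0 item2=0")
print(alice, bob)  # 7 3
```

The result agrees with the 7 and 3 points reported for Alice and Bob.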

Reinforcement Learning

To fine-tune a pretrained model with RL, use the reinforce.py script:

python reinforce.py \
--cuda \
--alice_model_file rnn_model.th \
--bob_model_file rnn_model.th \
--output_model_file rnn_rl_model.th \
--context_file data/negotiate/selfplay.txt  \
--temperature 0.5 \
--verbose \
--log_file rnn_rl.log \
--sv_train_freq 4 \
--nepoch 4 \
--selection_model_file selection_model.th  \
--rl_lr 0.00001 \
--rl_clip 0.0001 \
--sep_sel

License

This project is licensed under CC-BY-NC; see the LICENSE file for details.

Comments
  • Invalid argument 2: dimension 2 out of range of 2D tensor

    Hello,

    When the code below is run, an error is thrown. Could someone help me with this problem?

    PS: CUDA support is disabled.

    # concatenate attention and context hidden and pass it to the selection encoder
    h = torch.cat([attn, ctx_h], 2).squeeze(0)
    h = self.dropout(h)
    h = self.sel_encoder.forward(h)

    python train.py --data data/negotiate --bsz 16 --clip 0.5 \
    --decay_every 1 \
    --decay_rate 5.0 \
    --dropout 0.5 \
    --init_range 0.1 \
    --lr 1 \
    --max_epoch 30 \
    --min_lr 0.01 \
    --momentum 0.1 \
    --nembed_ctx 64 \
    --nembed_word 256 \
    --nesterov \
    --nhid_attn 256 \
    --nhid_ctx 64 \
    --nhid_lang 128 \
    --nhid_sel 256 \
    --nhid_strat 128 \
    --sel_weight 0.5 \
    --model_file sv_model.th

    dataset data/negotiate/train.txt, total 687919, unks 8718, ratio 1.27%
    dataset data/negotiate/val.txt, total 74653, unks 914, ratio 1.22%
    dataset data/negotiate/test.txt, total 70262, unks 847, ratio 1.21%
    Traceback (most recent call last):
      File "train.py", line 105, in <module>
        main()
      File "train.py", line 98, in main
        train_loss, valid_loss, select_loss = engine.train(corpus)
      File "/Users/diegosoaresub/Downloads/facebookIA/end-to-end-negotiator-master/src/engine.py", line 193, in train
        _, _, valid_select_loss = self.iter(N, epoch, lr, traindata, validdata)
      File "/Users/diegosoaresub/Downloads/facebookIA/end-to-end-negotiator-master/src/engine.py", line 161, in iter
        train_loss, train_time = self.train_pass(N, trainset)
      File "/Users/diegosoaresub/Downloads/facebookIA/end-to-end-negotiator-master/src/engine.py", line 104, in train_pass
        out, hid, tgt, sel_out, sel_tgt = Engine.forward(self.model, batch, volatile=False)
      File "/Users/diegosoaresub/Downloads/facebookIA/end-to-end-negotiator-master/src/engine.py", line 84, in forward
        sel_out = model.forward_selection(inpt, lang_h, ctx_h)
      File "/Users/diegosoaresub/Downloads/facebookIA/end-to-end-negotiator-master/src/models/dialog_model.py", line 178, in forward_selection
        h = torch.cat([attn, ctx_h], 2).squeeze(0)
      File "/Users/diegosoaresub/anaconda/lib/python3.6/site-packages/torch/autograd/variable.py", line 897, in cat
        return Concat.apply(dim, *iterable)
      File "/Users/diegosoaresub/anaconda/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 316, in forward
        ctx.input_sizes = [i.size(dim) for i in inputs]
      File "/Users/diegosoaresub/anaconda/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 316, in <listcomp>
        ctx.input_sizes = [i.size(dim) for i in inputs]
    RuntimeError: invalid argument 2: dimension 2 out of range of 2D tensor at /Users/soumith/miniconda2/conda-bld/pytorch_1502000975045/work/torch/lib/TH/generic/THTensor.c:24

    opened by diegosoaresub 8
  • training: value cannot be converted without overflow

    Dear Community,

    I get the following message when running python train.py:

    Traceback (most recent call last):
      File "train.py", line 105, in <module>
        main()
      File "train.py", line 98, in main
        train_loss, valid_loss, select_loss = engine.train(corpus)
      File "/Users/ra312/Documents/GitHub/end-to-end-negotiator/src/engine.py", line 195, in train
        if valid_select_loss < best_valid_select_loss:
    RuntimeError: value cannot be converted to type float without overflow: 10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104.000000

    I did as instructed. Any help is appreciated. I launched train.py in the conda environment py30. In this environment, I installed two packages: PyTorch and Visdom. The operating system is macOS High Sierra 10.13.4.

    opened by ra312 6
  • Human chat with full model

    Hi,

    I have two questions:

    1. How can I have a human chat with the full model? I noticed that chat.py is disabled.
    2. The dialogues generated by the full model are very short, even shorter than those of the original RNN version. What is the reason?

    Look forward to your reply!

    opened by rzhao1 3
  • Config has no attribute rl_temperature

    Hi, when running reinforce.py I get the following error, any idea what may be causing this? Thanks!

    File "reinforce.py", line 95, in main
        parser.add_argument('--temperature', type=float, default=config.rl_temperature,
    AttributeError: module 'config' has no attribute 'rl_temperature'
    
    opened by plugimi 3
  • TypeError: expected str, bytes or os.PathLike object, not NoneType

    Hi, I'm new to artificial intelligence, machine learning, and Python. I tried to implement this research as part of my final dissertation on CentOS 7. After installing Anaconda and all the other tools and dependencies, I executed train.py; it took about 20 minutes and finished fine. But when I looked for where it had saved the trained model, I couldn't find it. I wanted to chat with the model myself, so I executed chat.py, but it gave me this error:

    utils.py line 35
        with open(file_name, 'rb') as f:
    TypeError: expected str, bytes or os.PathLike object, not NoneType

    I don't know whether I was supposed to make any changes in the code. Please let me know if I need to, and where. Thank you.

    opened by SarimShehzad 3
  • reinforce fix: the bug with fixed random seed was corrected

    Dear authors,

    Thank you for the very good paper "Deal or No Deal? End-to-End Learning for Negotiation Dialogues", I read it, and it's very interesting.

    I tried to reproduce the results with the original code and parameters (https://github.com/facebookresearch/end-to-end-negotiator), but the reinforce result is not the same as in the paper: I get average scores of 5.95 vs. 5.58, compared with 7.1 vs. 4.2 in the paper. I also tried tuning the parameters a little, but it did not help. Do you have any idea what the reason might be?


    Dear authors,

    I've found the source of the problem; it was a very low-level effect. The random seed is fixed (in reinforce.py, line 137); however, small changes in the interpreter or libraries may affect the results. If I do not fix the random seed and change the optimization parameters a little, I get 7.33 vs. 4.33, compared with 5.95 vs. 5.58 for the original code, which is very close to the paper's result (7.1 vs. 4.2). Using only one epoch in reinforce.py looks like a typo; I used 4.

    original code reinforce.py results: 4000: dialog_len=4.73 sent_len=6.50 agree=90.67% advantage=0.41 time=0.091s comb_rew=11.53 alice_rew=5.95 alice_sel=69.30% alice_unique=2586 bob_rew=5.58 bob_sel=30.70% bob_unique=2227

    reinforce.py results (4 epoch): 16000: dialog_len=4.97 sent_len=6.66 agree=90.40% advantage=1.39 time=0.103s comb_rew=11.54 alice_rew=6.39 alice_sel=72.89% alice_unique=7342 bob_rew=5.14 bob_sel=27.11% bob_unique=6521

    reinforce.py results (4 epoch, tuning of the learning rate, non-fixed random seed): 16000: dialog_len=6.30 sent_len=7.28 agree=91.00% advantage=3.29 time=0.134s comb_rew=11.66 alice_rew=7.33 alice_sel=81.16% alice_unique=4345 bob_rew=4.33 bob_sel=18.84% bob_unique=9112

    opened by senya-ashukha 3
  • Cuda 9.0 / Pytorch 0.4.1 / Visdom 0.1.8.4 support

    Hello, my GPU (Tesla V100-SXM2-16GB) requires CUDA 9.0 support to work, so I updated your project (which is awesome!) to support breaking changes from CUDA 9.0 (from 8), as well as Pytorch 0.4.1 (from 0.3), and Visdom (0.1.8.4).

    I tested these updates on an AWS P3 instance, running the deep learning Ubuntu version 12, and the outputs from the GPU match running without CUDA on the original pull.

    I added logging statements for visibility into program operation, and moved default settings from the args parameters into their own config.py file. Please feel free to review / comment / merge if you'd like, and I'd be happy to make any changes if you'd like. Hope you find it useful. Thanks again! -Alex

    Training example

    2018-08-02 16:38:18,984 : INFO : train.py : Starting training using pytorch version:0.4.1
    2018-08-02 16:38:18,987 : INFO : train.py : CUDA is enabled. Using device_id:0 version:9.0.176 on gpu:Tesla V100-SXM2-16GB
    2018-08-02 16:38:18,987 : INFO : train.py : Building word corpus, requiring minimum word frequency of 20 for dictionary
    2018-08-02 16:38:19,732 : INFO : data.py : dataset data/negotiate/train.txt, total 687919, unks 8718, ratio 1.27%
    2018-08-02 16:38:19,770 : INFO : data.py : dataset data/negotiate/val.txt, total 74653, unks 914, ratio 1.22%
    2018-08-02 16:38:19,807 : INFO : data.py : dataset data/negotiate/test.txt, total 70262, unks 847, ratio 1.21%
    2018-08-02 16:38:19,808 : INFO : train.py : Building RNN-based dialogue model from word corpus
    2018-08-02 16:38:25,319 : INFO : train.py : Training model
    2018-08-02 16:38:36,093 : INFO : engine.py : | epoch 001 | train_loss 3.587 | train_ppl 36.125 | s/epoch 10.31 | lr 1.00000000
    2018-08-02 16:38:36,094 : INFO : engine.py : | epoch 001 | valid_loss 2.636 | valid_ppl 13.952
    2018-08-02 16:38:36,094 : INFO : engine.py : | epoch 001 | valid_select_loss 1.232 | valid_select_ppl 3.427
    2018-08-02 16:38:46,927 : INFO : engine.py : | epoch 002 | train_loss 2.862 | train_ppl 17.505 | s/epoch 10.33 | lr 1.00000000
    2018-08-02 16:38:46,927 : INFO : engine.py : | epoch 002 | valid_loss 2.293 | valid_ppl 9.904
    2018-08-02 16:38:46,928 : INFO : engine.py : | epoch 002 | valid_select_loss 1.195 | valid_select_ppl 3.305
    2018-08-02 16:38:57,761 : INFO : engine.py : | epoch 003 | train_loss 2.643 | train_ppl 14.061 | s/epoch 10.36 | lr 1.00000000
    2018-08-02 16:38:57,761 : INFO : engine.py : | epoch 003 | valid_loss 2.153 | valid_ppl 8.607
    2018-08-02 16:38:57,761 : INFO : engine.py : | epoch 003 | valid_select_loss 1.002 | valid_select_ppl 2.723
    2018-08-02 16:39:08,705 : INFO : engine.py : | epoch 004 | train_loss 2.392 | train_ppl 10.938 | s/epoch 10.47 | lr 1.00000000
    2018-08-02 16:39:08,705 : INFO : engine.py : | epoch 004 | valid_loss 2.070 | valid_ppl 7.925
    2018-08-02 16:39:08,705 : INFO : engine.py : | epoch 004 | valid_select_loss 0.692 | valid_select_ppl 1.997
    2018-08-02 16:39:19,554 : INFO : engine.py : | epoch 005 | train_loss 2.253 | train_ppl 9.512 | s/epoch 10.37 | lr 1.00000000
    2018-08-02 16:39:19,554 : INFO : engine.py : | epoch 005 | valid_loss 2.053 | valid_ppl 7.792
    2018-08-02 16:39:19,554 : INFO : engine.py : | epoch 005 | valid_select_loss 0.580 | valid_select_ppl 1.785
    2018-08-02 16:39:30,437 : INFO : engine.py : | epoch 006 | train_loss 2.174 | train_ppl 8.790 | s/epoch 10.40 | lr 1.00000000
    2018-08-02 16:39:30,437 : INFO : engine.py : | epoch 006 | valid_loss 2.013 | valid_ppl 7.484
    2018-08-02 16:39:30,437 : INFO : engine.py : | epoch 006 | valid_select_loss 0.533 | valid_select_ppl 1.704
    2018-08-02 16:39:41,434 : INFO : engine.py : | epoch 007 | train_loss 2.081 | train_ppl 8.014 | s/epoch 10.52 | lr 1.00000000
    2018-08-02 16:39:41,434 : INFO : engine.py : | epoch 007 | valid_loss 1.906 | valid_ppl 6.723
    2018-08-02 16:39:41,434 : INFO : engine.py : | epoch 007 | valid_select_loss 0.465 | valid_select_ppl 1.591
    2018-08-02 16:39:52,328 : INFO : engine.py : | epoch 008 | train_loss 2.009 | train_ppl 7.454 | s/epoch 10.41 | lr 1.00000000
    2018-08-02 16:39:52,328 : INFO : engine.py : | epoch 008 | valid_loss 1.895 | valid_ppl 6.655
    2018-08-02 16:39:52,328 : INFO : engine.py : | epoch 008 | valid_select_loss 0.450 | valid_select_ppl 1.569
    2018-08-02 16:40:03,231 : INFO : engine.py : | epoch 009 | train_loss 1.973 | train_ppl 7.190 | s/epoch 10.43 | lr 1.00000000
    2018-08-02 16:40:03,231 : INFO : engine.py : | epoch 009 | valid_loss 1.898 | valid_ppl 6.674
    2018-08-02 16:40:03,231 : INFO : engine.py : | epoch 009 | valid_select_loss 0.444 | valid_select_ppl 1.559
    2018-08-02 16:40:14,271 : INFO : engine.py : | epoch 010 | train_loss 1.935 | train_ppl 6.924 | s/epoch 10.56 | lr 1.00000000
    2018-08-02 16:40:14,271 : INFO : engine.py : | epoch 010 | valid_loss 1.852 | valid_ppl 6.375
    2018-08-02 16:40:14,271 : INFO : engine.py : | epoch 010 | valid_select_loss 0.395 | valid_select_ppl 1.484
    2018-08-02 16:40:25,199 : INFO : engine.py : | epoch 011 | train_loss 1.905 | train_ppl 6.722 | s/epoch 10.45 | lr 1.00000000
    2018-08-02 16:40:25,199 : INFO : engine.py : | epoch 011 | valid_loss 1.843 | valid_ppl 6.315
    2018-08-02 16:40:25,199 : INFO : engine.py : | epoch 011 | valid_select_loss 0.408 | valid_select_ppl 1.503
    2018-08-02 16:40:36,132 : INFO : engine.py : | epoch 012 | train_loss 1.874 | train_ppl 6.516 | s/epoch 10.46 | lr 1.00000000
    2018-08-02 16:40:36,132 : INFO : engine.py : | epoch 012 | valid_loss 1.843 | valid_ppl 6.318
    2018-08-02 16:40:36,132 : INFO : engine.py : | epoch 012 | valid_select_loss 0.346 | valid_select_ppl 1.413
    2018-08-02 16:40:47,090 : INFO : engine.py : | epoch 013 | train_loss 1.850 | train_ppl 6.357 | s/epoch 10.48 | lr 1.00000000
    2018-08-02 16:40:47,090 : INFO : engine.py : | epoch 013 | valid_loss 1.830 | valid_ppl 6.237
    2018-08-02 16:40:47,090 : INFO : engine.py : | epoch 013 | valid_select_loss 0.322 | valid_select_ppl 1.380
    2018-08-02 16:40:57,909 : INFO : engine.py : | epoch 014 | train_loss 1.822 | train_ppl 6.183 | s/epoch 10.35 | lr 1.00000000
    2018-08-02 16:40:57,910 : INFO : engine.py : | epoch 014 | valid_loss 1.899 | valid_ppl 6.679
    2018-08-02 16:40:57,910 : INFO : engine.py : | epoch 014 | valid_select_loss 0.285 | valid_select_ppl 1.329
    2018-08-02 16:41:08,608 : INFO : engine.py : | epoch 015 | train_loss 1.796 | train_ppl 6.026 | s/epoch 10.23 | lr 1.00000000
    2018-08-02 16:41:08,608 : INFO : engine.py : | epoch 015 | valid_loss 1.854 | valid_ppl 6.387
    2018-08-02 16:41:08,608 : INFO : engine.py : | epoch 015 | valid_select_loss 0.344 | valid_select_ppl 1.410
    2018-08-02 16:41:19,258 : INFO : engine.py : | epoch 016 | train_loss 1.770 | train_ppl 5.870 | s/epoch 10.19 | lr 1.00000000
    2018-08-02 16:41:19,258 : INFO : engine.py : | epoch 016 | valid_loss 1.832 | valid_ppl 6.247
    2018-08-02 16:41:19,258 : INFO : engine.py : | epoch 016 | valid_select_loss 0.264 | valid_select_ppl 1.303
    2018-08-02 16:41:30,106 : INFO : engine.py : | epoch 017 | train_loss 1.748 | train_ppl 5.746 | s/epoch 10.38 | lr 1.00000000
    2018-08-02 16:41:30,107 : INFO : engine.py : | epoch 017 | valid_loss 1.802 | valid_ppl 6.061
    2018-08-02 16:41:30,107 : INFO : engine.py : | epoch 017 | valid_select_loss 0.218 | valid_select_ppl 1.243
    2018-08-02 16:41:40,915 : INFO : engine.py : | epoch 018 | train_loss 1.732 | train_ppl 5.654 | s/epoch 10.33 | lr 1.00000000
    2018-08-02 16:41:40,915 : INFO : engine.py : | epoch 018 | valid_loss 1.862 | valid_ppl 6.435
    2018-08-02 16:41:40,915 : INFO : engine.py : | epoch 018 | valid_select_loss 0.220 | valid_select_ppl 1.247
    2018-08-02 16:41:51,674 : INFO : engine.py : | epoch 019 | train_loss 1.712 | train_ppl 5.539 | s/epoch 10.29 | lr 1.00000000
    2018-08-02 16:41:51,674 : INFO : engine.py : | epoch 019 | valid_loss 1.795 | valid_ppl 6.019
    2018-08-02 16:41:51,674 : INFO : engine.py : | epoch 019 | valid_select_loss 0.247 | valid_select_ppl 1.280
    2018-08-02 16:42:02,520 : INFO : engine.py : | epoch 020 | train_loss 1.702 | train_ppl 5.487 | s/epoch 10.38 | lr 1.00000000
    2018-08-02 16:42:02,521 : INFO : engine.py : | epoch 020 | valid_loss 1.817 | valid_ppl 6.155
    2018-08-02 16:42:02,521 : INFO : engine.py : | epoch 020 | valid_select_loss 0.201 | valid_select_ppl 1.222
    2018-08-02 16:42:13,287 : INFO : engine.py : | epoch 021 | train_loss 1.687 | train_ppl 5.403 | s/epoch 10.30 | lr 1.00000000
    2018-08-02 16:42:13,288 : INFO : engine.py : | epoch 021 | valid_loss 1.835 | valid_ppl 6.267
    2018-08-02 16:42:13,288 : INFO : engine.py : | epoch 021 | valid_select_loss 0.219 | valid_select_ppl 1.245
    2018-08-02 16:42:24,054 : INFO : engine.py : | epoch 022 | train_loss 1.673 | train_ppl 5.330 | s/epoch 10.30 | lr 1.00000000
    2018-08-02 16:42:24,054 : INFO : engine.py : | epoch 022 | valid_loss 1.800 | valid_ppl 6.051
    2018-08-02 16:42:24,054 : INFO : engine.py : | epoch 022 | valid_select_loss 0.207 | valid_select_ppl 1.230
    2018-08-02 16:42:34,958 : INFO : engine.py : | epoch 023 | train_loss 1.660 | train_ppl 5.259 | s/epoch 10.44 | lr 1.00000000
    2018-08-02 16:42:34,958 : INFO : engine.py : | epoch 023 | valid_loss 1.798 | valid_ppl 6.040
    2018-08-02 16:42:34,958 : INFO : engine.py : | epoch 023 | valid_select_loss 0.200 | valid_select_ppl 1.221
    2018-08-02 16:42:45,765 : INFO : engine.py : | epoch 024 | train_loss 1.649 | train_ppl 5.201 | s/epoch 10.33 | lr 1.00000000
    2018-08-02 16:42:45,765 : INFO : engine.py : | epoch 024 | valid_loss 1.780 | valid_ppl 5.931
    2018-08-02 16:42:45,765 : INFO : engine.py : | epoch 024 | valid_select_loss 0.191 | valid_select_ppl 1.210
    2018-08-02 16:42:56,670 : INFO : engine.py : | epoch 025 | train_loss 1.639 | train_ppl 5.148 | s/epoch 10.43 | lr 1.00000000
    2018-08-02 16:42:56,670 : INFO : engine.py : | epoch 025 | valid_loss 1.795 | valid_ppl 6.020
    2018-08-02 16:42:56,670 : INFO : engine.py : | epoch 025 | valid_select_loss 0.170 | valid_select_ppl 1.185
    2018-08-02 16:43:07,725 : INFO : engine.py : | epoch 026 | train_loss 1.629 | train_ppl 5.096 | s/epoch 10.58 | lr 1.00000000
    2018-08-02 16:43:07,725 : INFO : engine.py : | epoch 026 | valid_loss 1.783 | valid_ppl 5.950
    2018-08-02 16:43:07,725 : INFO : engine.py : | epoch 026 | valid_select_loss 0.164 | valid_select_ppl 1.178
    2018-08-02 16:43:18,639 : INFO : engine.py : | epoch 027 | train_loss 1.619 | train_ppl 5.048 | s/epoch 10.44 | lr 1.00000000
    2018-08-02 16:43:18,639 : INFO : engine.py : | epoch 027 | valid_loss 1.765 | valid_ppl 5.844
    2018-08-02 16:43:18,639 : INFO : engine.py : | epoch 027 | valid_select_loss 0.203 | valid_select_ppl 1.225
    2018-08-02 16:43:29,566 : INFO : engine.py : | epoch 028 | train_loss 1.605 | train_ppl 4.978 | s/epoch 10.45 | lr 1.00000000
    2018-08-02 16:43:29,566 : INFO : engine.py : | epoch 028 | valid_loss 1.781 | valid_ppl 5.937
    2018-08-02 16:43:29,566 : INFO : engine.py : | epoch 028 | valid_select_loss 0.178 | valid_select_ppl 1.195
    2018-08-02 16:43:40,507 : INFO : engine.py : | epoch 029 | train_loss 1.596 | train_ppl 4.936 | s/epoch 10.47 | lr 1.00000000
    2018-08-02 16:43:40,507 : INFO : engine.py : | epoch 029 | valid_loss 1.779 | valid_ppl 5.924
    2018-08-02 16:43:40,507 : INFO : engine.py : | epoch 029 | valid_select_loss 0.202 | valid_select_ppl 1.224
    2018-08-02 16:43:51,585 : INFO : engine.py : | epoch 030 | train_loss 1.586 | train_ppl 4.884 | s/epoch 10.60 | lr 1.00000000
    2018-08-02 16:43:51,585 : INFO : engine.py : | epoch 030 | valid_loss 1.799 | valid_ppl 6.043
    2018-08-02 16:43:51,585 : INFO : engine.py : | epoch 030 | valid_select_loss 0.144 | valid_select_ppl 1.155
    2018-08-02 16:43:51,605 : INFO : engine.py : | start annealing | best validselectloss 0.144 | best validselectppl 1.155
    2018-08-02 16:44:01,865 : INFO : engine.py : | epoch 031 | train_loss 1.515 | train_ppl 4.547 | s/epoch 9.80 | lr 0.20000000
    2018-08-02 16:44:01,865 : INFO : engine.py : | epoch 031 | valid_loss 1.733 | valid_ppl 5.658
    2018-08-02 16:44:01,865 : INFO : engine.py : | epoch 031 | valid_select_loss 0.128 | valid_select_ppl 1.137
    2018-08-02 16:44:12,118 : INFO : engine.py : | epoch 032 | train_loss 1.490 | train_ppl 4.438 | s/epoch 9.78 | lr 0.04000000
    2018-08-02 16:44:12,118 : INFO : engine.py : | epoch 032 | valid_loss 1.731 | valid_ppl 5.645
    2018-08-02 16:44:12,118 : INFO : engine.py : | epoch 032 | valid_select_loss 0.129 | valid_select_ppl 1.137
    2018-08-02 16:44:12,138 : INFO : train.py : final select_ppl 1.137

    real 5m54.212s
    user 5m36.176s
    sys 0m7.336s
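
The learning rates in the annealing phase of this log (1.0, then 0.2, then 0.04) are consistent with a simple divide-by-decay-rate schedule. The sketch below is an illustration only, not the repository's implementation. The function name `annealing_schedule` is hypothetical, and the values lr 1, decay_rate 5.0, decay_every 1 and min_lr 0.01 are assumptions taken from the training commands quoted elsewhere in these comments; they are not confirmed for this particular run.

```python
def annealing_schedule(lr, decay_rate, min_lr, decay_every=1, max_steps=100):
    """Return the learning rates used during annealing (hypothetical helper).

    Every `decay_every` epochs the rate is divided by `decay_rate`;
    annealing stops as soon as the rate would fall below `min_lr`.
    """
    rates = []
    for step in range(max_steps):
        if step % decay_every == 0:
            lr /= decay_rate
            if lr < min_lr:
                break
        rates.append(lr)
    return rates


# Assumed settings: --lr 1 --decay_rate 5.0 --decay_every 1 --min_lr 0.01
rates = annealing_schedule(lr=1.0, decay_rate=5.0, min_lr=0.01)
print(["%.8f" % r for r in rates])  # ['0.20000000', '0.04000000']
```

Under these assumptions the next rate would be 0.008, which is below min_lr, so training would stop after two annealing epochs, as in the log.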

    opened by zredlined 2
  • Expecting torch.LongTensor but found type torch.IntTensor

    In the modules.py file, on lines 107 and 108, only 2 arguments are passed:

    cnt = ctx.index_select(0, cnt_idx)
    val = ctx.index_select(0, val_idx)

    But according to the documentation (https://pytorch.org/docs/stable/torch.html), 3 arguments are passed. I am getting the error below:

    RuntimeError: Expected object of type torch.LongTensor but found type torch.IntTensor

    I tried to typecast it as follows: cnt = ctx.index_select(0, torch.LongTensor(cnt_idx)). But I still got the same error.

    I am using Python 3.7. What might be the issue?

    Thanks

    opened by hegebharat 2
  • non-empty torch.LongTensor is ambiguous

    Hi, when I ran reinforce.py, I got the following error. Since my machine doesn't have a GPU, I ran train.py and reinforce.py without --cuda. I also got the same error when running selfplay.py. Thanks.

    (py30) [root@localhost src]# python reinforce.py --data data/negotiate --bsz 16 --clip 1 --context_file data/negotiate/selfplay.txt --eps 0.0 --gamma 0.95 --lr 0.5 --momentum 0.1 --nepoch 1 --nesterov --ref_text data/negotiate/train.txt --rl_clip 1 --rl_lr 0.1 --score_threshold 6 --sv_train_freq 4 --temperature 0.5 --alice_model sv_model.th --bob_model sv_model.th --output_model_file rl_model.th
    Traceback (most recent call last):
      File "reinforce.py", line 165, in <module>
        main()
      File "reinforce.py", line 159, in main
        reinforce.run()
      File "reinforce.py", line 57, in run
        self.dialog.run(ctxs, self.logger)
      File "/root/end-to-end-negotiator-master/src/dialog.py", line 170, in run
        out = writer.write()
      File "/root/end-to-end-negotiator-master/src/agent.py", line 329, in write
        100, self.args.temperature)
      File "/root/end-to-end-negotiator-master/src/models/dialog_model.py", line 282, in write
        if inpt:
      File "/root/anaconda3/envs/py30/lib/python3.6/site-packages/torch/autograd/variable.py", line 123, in __bool__
        torch.typename(self.data) + " is ambiguous")
    RuntimeError: bool value of Variable objects containing non-empty torch.LongTensor is ambiguous

    opened by uclaRaymond 2
  • PyTorch 0.4.1 does not work for me

    PyTorch 0.4.1 does not work for me, but PyTorch 1.0.0 does. When I use 0.4.1, it shows "mean is not a valid value for reduction" (it should be "elementwise_mean" in 0.4.1).

    opened by ryz1 1
  • CUDA not available. Unable to train

    Hi, when I run "python train.py --data data/negotiate --cuda --bsz 16 --clip 0.5 --decay_every 1 --decay_rate 5.0 --dropout 0.5 --init_range 0.1 --lr 1 --max_epoch 30 --min_lr 0.01 --momentum 0.1 --nembed_ctx 64 --nembed_word 256 --nesterov --nhid_attn 256 --nhid_ctx 64 --nhid_lang 128 --nhid_sel 256 --nhid_strat 128 --sel_weight 0.5 --model_file sv_model.th"

    I get an assertion failure saying that CUDA is not available. I do not have a GPU on my machine, so CUDA won't work.

    Is there any way to train without a GPU?

    Thanks for the help.

    opened by ishwara-bhat 1
  • test.py: could not run, it seems like the file version is not consistent with others

      File "test.py", line 62, in main
        out, hid, tgt, sel_out, sel_tgt = Engine.forward(model, batch)#, volatile=False)
      File "/home/liuchen/data/end-to-end-negotiator/src/engines/engine.py", line 78, in forward
        sel_tgt = Variable(sel_tgt)
    TypeError: Variable data has to be a tensor, but got list

    opened by ghost 2
  • Questions about arguments in reinforce script

    Hello.

    I have a few questions. I would be grateful if you answer them.

    Could you please tell me what the argument is and what it affects? parser.add_argument('--smart_bob', action='store_true', default=False, help='make Bob smart again')

    Section 6.1 of "Deal or No Deal? End-to-End Learning for Negotiation Dialogues" says: "During reinforcement learning, we use a learning rate of 0.1, clip gradients above 1.0, and use a discount factor of γ=0.95." But reinforce.py has:
    parser.add_argument('--gamma', type=float, default=0.99, help='discount factor')
    Does this difference matter for learning?
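To give a feel for what the discount factor changes, the sketch below computes discounted returns, G_t = r_t + γ·G_{t+1}, for a made-up episode under both values mentioned in the question. This is the textbook definition; how `reinforce.py` applies `--gamma` is not shown in this thread, so treat the connection as an assumption. The reward sequence is hypothetical.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step t."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical episode: ten steps, with a reward of 6 only at the end.
rewards = [0.0] * 9 + [6.0]
for gamma in (0.95, 0.99):
    first = discounted_returns(rewards, gamma)[0]
    print(f"gamma={gamma}: return credited to the first step = {first:.3f}")
```

With the smaller γ, early steps receive noticeably less credit for a reward that arrives at the end, and the gap widens as episodes get longer.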

    Also, reinforce.py has:
    parser.add_argument('--clip', type=float, default=0.1, help='gradient clip')
    while the paper says to clip gradients above 1.0.

    Regarding the RL learning rate and gradient clip, the script defaults are:
    parser.add_argument('--rl_lr', type=float, default=0.002, help='RL learning rate')
    parser.add_argument('--rl_clip', type=float, default=2.0, help='RL gradient clip')
    The code snippet in the README uses --rl_lr 0.00001 and --rl_clip 0.0001 instead. Does this difference matter for learning?
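The effect of the clip threshold can be seen on a toy gradient. The sketch below assumes norm-based clipping, where the whole gradient is rescaled if its L2 norm exceeds the threshold (the same idea as `torch.nn.utils.clip_grad_norm_`). Whether `reinforce.py` clips by norm or by value is not stated in this thread, and `clip_by_global_norm` is a hypothetical helper written for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale grads so that their L2 norm does not exceed max_norm.

    Returns the (possibly rescaled) gradients and the original norm.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


grads = [3.0, 4.0]  # hypothetical gradient with L2 norm 5
for max_norm in (2.0, 1.0, 0.0001):
    clipped, norm = clip_by_global_norm(grads, max_norm)
    print(f"max_norm={max_norm}: original norm={norm}, clipped={clipped}")
```

Under this assumption, the direction of the update is unchanged and only its size is capped, so a very small threshold combined with a very small learning rate makes each update tiny.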

    How long does it take to run reinforce.py with the arguments from the snippet?

    During training, many dialogs appear in which one of the agents repeats a single word many times. Is that OK?

    opened by Ulitochka 2
  • Reinforcement Learning with full_model

    Hello.

    If I use full_model.th in the reinforcement learning script:

    python reinforce.py \
      --cuda \
      --alice_model_file full_model.th \
      --bob_model_file full_model.th \
      --output_model_file rl_model.th \
      --context_file data/negotiate/selfplay.txt \
      --temperature 0.5 \
      --verbose \
      --log_file rnn_rl.log \
      --sv_train_freq 4 \
      --nepoch 4 \
      --selection_model_file selection_model.th \
      --rl_lr 0.00001 \
      --rl_clip 0.0001 \
      --sep_sel

    I have this:

    Traceback (most recent call last):
      File "/home/.../end-to-end-negotiator/src/reinforce.py", line 169, in <module>
        main()
      File "/home/.../end-to-end-negotiator/src/reinforce.py", line 163, in main
        reinforce.run()
      File "/home/.../end-to-end-negotiator/src/reinforce.py", line 51, in run
        self.dialog.run(ctxs, self.logger)
      File "/home/.../end-to-end-negotiator/src/dialog.py", line 171, in run
        out = writer.write(max_words=words_left)
      File "/home/.../end-to-end-negotiator/src/agent.py", line 1661, in write
        _, lat_h, log_q_z = self.model.forward_prediction(self.cnt, self.mem_h, sample=self.train)
      File "/home/.../end-to-end-negotiator/src/models/latent_clustering_model.py", line 541, in forward_prediction
        z = q_z.multinomial().detach()
    TypeError: multinomial() missing 1 required positional arguments: "num_samples"

    torch==1.0.0
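The error says that `multinomial()` needs a `num_samples` argument in this PyTorch version. A likely fix is to pass it explicitly, for example `q_z.multinomial(1)`, assuming one sample per row is what the model expects (the shape of `q_z` in `latent_clustering_model.py` is not shown here, so that is an assumption). The sketch below uses a made-up probability tensor to show the call.

```python
import torch

torch.manual_seed(0)

# Hypothetical batch of two categorical distributions over three clusters.
q_z = torch.tensor([[0.1, 0.2, 0.7],
                    [0.5, 0.5, 0.0]])

# Draw one index per row; num_samples is passed explicitly.
z = q_z.multinomial(1).detach()

print(z.shape)
```

The result has one column per requested sample, so code that previously relied on a different output shape may need a matching adjustment.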

    opened by Ulitochka 0
Owner
Facebook Research