Source code of SIGIR2021 Paper 'One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles'

Overview

DHAP

Source code of SIGIR2021 Long Paper:

One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles.

Preinstallation

First, install the Python packages in your Python 3 environment:

  git clone https://github.com/zhengyima/DHAP.git DHAP
  cd DHAP
  pip install -r requirements.txt

Then download the pre-trained word embeddings used to initialize model training. We provide two sets of word embeddings on Google Drive:

  • sgns.weibo.bigram-char: Chinese word embeddings pre-trained on Weibo, following Li et al. Google Drive
  • Fasttext embeddings: English word embeddings pre-trained on the Reddit dataset. Google Drive

You can also pre-train your own embeddings (in the same format, i.e., the standard txt format) and use them in the model.

After downloading, place the embedding file at the path that EMB_FILE points to.
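
Before training, it can help to confirm that the embedding file parses cleanly and to see which vector dimension it contains. The following is a minimal sketch, assuming that "standard txt format" means one token per line followed by space-separated floats, optionally preceded by a word2vec-style header line holding the vector count and dimension. The function name inspect_embeddings is hypothetical and is not part of this repository.

```python
from collections import Counter


def inspect_embeddings(path, encoding="utf-8"):
    """Return (num_vectors, Counter mapping vector dimension -> line count).

    Assumed format (not confirmed by the repository): each line holds a token
    followed by space-separated floats; an optional first line "<count> <dim>"
    is treated as a header and skipped.
    """
    dims = Counter()
    with open(path, encoding=encoding) as handle:
        for line_no, line in enumerate(handle):
            parts = line.rstrip().split(" ")
            if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
                continue  # word2vec-style header line
            if len(parts) < 2:
                continue  # blank line or a token without a vector
            for value in parts[1:]:
                float(value)  # raises ValueError on a malformed number
            dims[len(parts) - 1] += 1
    return sum(dims.values()), dims
```

For example, `count, dims = inspect_embeddings("path/to/embeddings.txt")` reports how many vectors were read. More than one key in `dims` means some lines have a different number of values than others.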

Data

You need to provide the users' dialogue histories to train the model. For convenience, we provide a very small subset of PChatbot in data/ as demo data. In that directory, each user's dialogue history is saved in one text file. Each line of a file contains eight tab-separated fields: post text, user id of post, post timestamp, response text, user id of response, response timestamp, _, _.

You can refer to seq2seq/dataset/perdialogDatasets.py for more details about the data processing.
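
The line layout described above can be sketched as a small parser. This is an illustration only, not the logic in seq2seq/dataset/perdialogDatasets.py. The names DialogueTurn and parse_history_line are hypothetical, and the last two fields are dropped because this README does not describe them.

```python
from typing import NamedTuple


class DialogueTurn(NamedTuple):
    """One post/response pair from a user's history file (hypothetical type)."""
    post_text: str
    post_user_id: str
    post_timestamp: str
    response_text: str
    response_user_id: str
    response_timestamp: str


def parse_history_line(line: str) -> DialogueTurn:
    """Split one tab-separated history line into its six described fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 tab-separated fields, got {len(fields)}")
    # The two trailing fields are undocumented placeholders, so they are ignored.
    return DialogueTurn(*fields[:6])
```

Timestamps are kept as strings here because their format is not specified in this README.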

If you are interested in the dataset PChatbot, please go to its official repository for more details.

Model Training

We provide a shell script, scripts/train_chat.sh, to start model training. Modify DATA_DIR and EMB_FILE to point to your own paths, then start training with the following command:

bash scripts/train_chat.sh

The hyper-parameters are defined and set in configParser.py.

After training, the trained checkpoints are saved in outputs. The inference results are saved in RESULT_FILE, which you define in scripts/train_chat.sh.

Evaluating

To compute a variety of evaluation metrics (e.g., BLEU, P-Cover), we provide a shell script, scripts/eval.sh. Modify EMB_FILE to point to your own path, then evaluate the results with the following command:

bash scripts/eval.sh
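
For readers unfamiliar with BLEU, its core ingredient is clipped n-gram precision: each n-gram in the generated response is credited at most as many times as it occurs in the reference. The sketch below illustrates only that ingredient on pre-tokenized input. It is not the implementation used by scripts/eval.sh, and the function names are hypothetical. Full BLEU additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0  # candidate is shorter than n tokens
    # Each candidate n-gram counts at most as often as it appears in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


# "the" appears 4 times in the candidate but only 2 times in the reference.
print(clipped_precision("the the the the".split(), "the cat is on the mat".split()))  # 0.5
```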

Citations

If our code helps you, please cite our work by:

@inproceedings{DBLP:conf/sigir/madousigir21,
     author = {Zhengyi Ma and Zhicheng Dou and Yutao Zhu and Hanxun Zhong and Ji-Rong Wen},
     title = {One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles},
     booktitle = {Proceedings of the {SIGIR} 2021}, 
     publisher = {{ACM}}, 
     year = {2021}, 
     url = {https://doi.org/10.1145/3404835.3462828}, 
     doi = {10.1145/3404835.3462828}}

Comments
  • ValueError: could not broadcast input array from shape (100,) into shape (300,)

    I am trying to run this on my MacBook Pro M2. I followed the directions, but I am getting this error. Help please? I changed the data and embedding file paths in the bash scripts.

    (torch-gpu) /home DHAP % bash scripts/train_chat.sh
    Traceback (most recent call last):
      File "runModel.py", line 56, in <module>
        src_vocab_list, embs = VocabField.load_from_pretrained(emb_file)
      File "/DHAP/seq2seq/dataset/vocabField.py", line 55, in load_from_pretrained
        embedding[i+5] = vec
    ValueError: could not broadcast input array from shape (100,) into shape (300,)

    opened by EiffelCEO 0
Owner
ZYMa
Master candidate. IR and NLP.