Google's Meena transformer chatbot implementation

Overview

Meena chatbot

Here's my attempt at recreating Meena, a state-of-the-art chatbot developed by Google Research and described in the paper "Towards a Human-like Open-Domain Chatbot".

For this implementation I used the tensor2tensor deep learning library with an Evolved Transformer model, as described in the paper.

The training set is the Italian-language OpenSubtitles corpus. Many other languages are available here.
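The training notebook references the Italian monolingual file on opus.nlpl.eu. Below is a minimal sketch of building the download URL for another language. It assumes that other languages follow the same URL pattern with only the language code changed, which is not verified here; `opensubtitles_url` and `OPUS_URL_TEMPLATE` are hypothetical names used for illustration.

```python
# Sketch: build the OPUS download URL for a monolingual OpenSubtitles file.
# Assumption: only the language code differs between languages.
OPUS_URL_TEMPLATE = (
    "http://opus.nlpl.eu/download.php"
    "?f=OpenSubtitles/v2018/mono/OpenSubtitles.{lang}.gz"
)


def opensubtitles_url(lang: str) -> str:
    """Return the download URL for a language code such as "it"."""
    lang = lang.strip()
    if not lang:
        raise ValueError("language code must not be empty")
    return OPUS_URL_TEMPLATE.format(lang=lang)


# The Italian URL matches the one referenced by the notebook.
print(opensubtitles_url("it"))
```

Check that the resulting URL actually exists before starting a long training run on it.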

Model

Similarly to the work done in the paper, this model consists of 1 encoder block and 12 decoder blocks, for a total of 108M parameters. The optimizer is Adafactor, with the same learning rate schedule as described in the paper.

Training

Here are the results after training the model on 40M sentences of the Italian OpenSubtitles dataset. The learning rate starts at 0.01, remains constant for 10k steps, and then decays with the inverse square root of the number of steps.

Learning rate schedule
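The schedule described above can be sketched in a few lines of Python. This is an illustration rather than the code used for training, and it assumes the decay is scaled so that the curve is continuous at 10k steps; the names `BASE_LR`, `CONSTANT_STEPS` and `learning_rate` are hypothetical.

```python
import math

BASE_LR = 0.01           # initial learning rate stated above
CONSTANT_STEPS = 10_000  # number of steps with a constant rate


def learning_rate(step: int) -> float:
    """Constant for the first 10k steps, then proportional to 1/sqrt(step)."""
    # Assumption: the decay is scaled so the rate is continuous at 10k steps.
    return BASE_LR * math.sqrt(CONSTANT_STEPS / max(step, CONSTANT_STEPS))


for step in (0, 10_000, 40_000, 200_000):
    print(step, learning_rate(step))
```

Under this assumption the rate halves every time the step count quadruples, for example from 10k to 40k steps.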

Here's the plot of the evaluation loss during training:

Evaluation loss plot

The final perplexity score is 10.4, which is very close to the 10.2 achieved by Google's Meena chatbot.
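For reference, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows that relation, assuming log-probabilities in nats; the token probabilities are made up for illustration and are not taken from this model.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token), in nats."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Made-up log-probabilities for a 4-token sequence (illustration only).
example = [math.log(p) for p in (0.1, 0.2, 0.05, 0.1)]
print(perplexity(example))

# A perplexity of 10.4 corresponds to a mean per-token loss of ln(10.4) nats.
print(math.log(10.4))
```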

The paper shows a correlation between perplexity and the Sensibleness and Specificity Average (SSA), which in turn is correlated with the "human likeness" of the chatbot. Our perplexity score suggests that our bot is better than other chatbots such as Cleverbot and DialoGPT:

Perplexity SSA correlation

However, the dataset does not represent normal human conversations well. On the other hand, OpenSubtitles provides very large datasets in many languages.

Run pretrained model

Simply run the notebook meena_chatbot_inference.ipynb.

Otherwise, download the following model and extract it. Set the proper MODEL_DIR and CHECKPOINT_NAME in predict.py and run main.py.

Pretrained model checkpoint

Train a new model

To train, simply run the IPython notebook on Google Colab; the model will be saved to Google Drive. At the end of the execution you can interact with the chatbot.

Export the model

The model can be exported by copying the following files into a folder:

  • hparams.json
  • The trained model checkpoint
  • The vocabulary .subwords file

Then run main.py after setting the proper model directory.
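Before running main.py it can help to check that the export folder is complete. The helper below is a hypothetical sketch and not part of this repository. It assumes the checkpoint follows the usual TensorFlow layout of several files sharing one prefix (`<name>.index` plus `<name>.data-*`), and the folder name in the usage line is made up.

```python
from pathlib import Path


def missing_export_files(model_dir, checkpoint_name):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / "hparams.json").is_file():
        missing.append("hparams.json")
    # Assumption: a checkpoint is stored as <name>.index and <name>.data-*.
    if not (root / f"{checkpoint_name}.index").is_file():
        missing.append(f"{checkpoint_name}.index")
    if not any(root.glob(f"{checkpoint_name}.data-*")):
        missing.append(f"{checkpoint_name}.data-*")
    # The vocabulary is any file ending in .subwords.
    if not any(root.glob("*.subwords")):
        missing.append("*.subwords vocabulary")
    return missing


# "exported_model" is a placeholder folder name; an empty list means
# every expected file was found.
print(missing_export_files("exported_model", "model.ckpt-200000"))
```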

Serving

server.py provides a simple HTTP API for serving the chatbot.

Comments
  • bot response is so slow?

    Hi, first of all, I'm so thankful for your good work creating this bot. But when I run your bot on Google Colab, it responds very, very slowly. Is that normal, or is something else wrong? Please let me know.

    Best Wishes,

    opened by danial1995 4
  • Not getting good results ?

    Hi @frankplus, first of all, thanks for your work. I used your Jupyter notebook to train the model on the OpenSubtitles English dataset, but I'm not getting good results. I trained it for about 3 hours on a Colab Pro GPU. Training was still running when I stopped it and tried the code for testing the model, but the results were not that good. Can you please suggest a better way to make modifications to Meena, or guide me on how to use your repo efficiently?

    opened by alan-ai-learner 2
  • How make working under TF2?

    First of all, thank you for your effort. I'd like to run your code on my personal robot, beginning with some tests on a PC. Unfortunately I've had no success, maybe because I've installed TF2 (2.10.0) on Python 3.8.10 and the code seems to be written for TF1. Is that true? I spent several days trying to convert the code to TF2 using 'tf.compat.v1.xxxx', but without success (I'm an absolute newbie in deep learning). After loading the model 'model.ckpt-200000' from Italian_108M in predict.py with:

        with tf.compat.v1.Session() as sess:
            new_saver = tf.compat.v1.train.import_meta_graph(MODEL_DIR + CHECKPOINT_NAME + '.meta')
            new_saver.restore(sess, MODEL_DIR + CHECKPOINT_NAME)  # 'model.ckpt-200000'

    I get the exception 'Attempting to capture an EagerTensor without building a function' on the next statement: output_candidates = [chatbot_model.infer(encoded_inputs, decode_length=1) for _ in range(NUM_SAMPLES)]

    At this point I don't know how to go on. So my question is whether it is possible to get some help porting the code to TF2. Thank you in advance.

    opened by Luke1962 0
  • Not getting better results after training for several days?

    Hi @frankplus, since you got better results, the problem might be on my side. I used this dataset, so please take a look at it and let me know. Can you suggest a dataset for English conversation?

    Thanks

    opened by alan-ai-learner 4
  • Where is the source file on the nlpl page exactly?

    The notebook references http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.it.gz as the source. When I visit the linked opus.nlpl.eu page, I see a grid with a bunch of LANG.xml.gz files, but I cannot seem to locate a file other than the Italian one. Can you link me to the exact page where I can find alternatives to Italian, so that I can train the model with a different data source, please?

    opened by tgmerritt 1
Owner
Francesco Pham
MSc Computer Engineering