Source code for "Efficient Training of BERT by Progressively Stacking"

Overview

Introduction

This repository contains the code to reproduce the results of "Efficient Training of BERT by Progressively Stacking". The code is based on Fairseq.
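
For orientation, the core idea of progressive stacking in the paper is to initialize a deeper model by copying a trained shallower model on top of itself, for example 3 layers to 6 layers to 12 layers. The following is a minimal sketch of that copy step on a plain dictionary of parameters, not the code used in this repository. The function name `stack_layers` and the key pattern `...layers.<index>...` are assumptions made for illustration; the actual parameter names may differ.

```python
import re

# Hypothetical key pattern: "<anything>layers.<index>.<rest>".
# The real parameter names in this repository may differ.
LAYER_KEY = re.compile(r"^(?P<prefix>.*layers\.)(?P<index>\d+)(?P<suffix>\..*)$")


def stack_layers(state, num_layers):
    """Return parameters for a model with 2 * num_layers layers.

    Layer i of the shallow model initializes both layer i and layer
    i + num_layers of the deeper model. Parameters that do not belong
    to a layer (for example embeddings) are copied unchanged.
    """
    stacked = {}
    for name, value in state.items():
        stacked[name] = value
        match = LAYER_KEY.match(name)
        if match:
            upper = int(match.group("index")) + num_layers
            new_name = f"{match.group('prefix')}{upper}{match.group('suffix')}"
            # With real tensors you would typically clone the value here
            # so that the two layers do not share storage.
            stacked[new_name] = value
    return stacked


# Toy example: strings stand in for weight tensors.
shallow = {
    "embed.weight": "E",
    "encoder.layers.0.attn.weight": "A0",
    "encoder.layers.1.attn.weight": "A1",
}
deep = stack_layers(shallow, num_layers=2)
for key in sorted(deep):
    print(key, deep[key])
```

In the toy example, layers 2 and 3 of the deeper model start from the weights of layers 0 and 1, and the embedding entry is carried over as is.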

Requirements and Installation

  • PyTorch >= 1.0.0
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • Python version 3.7

After PyTorch is installed, install the remaining requirements with:

pip install -r requirements.txt

Getting Started

Step 1:

bash install.sh

This script downloads:

  1. Moses Decoder
  2. Subword NMT
  3. Fast BPE (the following steps use Subword NMT rather than Fast BPE; Fast BPE is recommended if you want to generate your own dictionary on a large-scale dataset)

These libraries perform cleaning, tokenization, and BPE encoding for the GLUE data in step 3. They are also helpful if you want to build your own corpus for BERT training or test our model on your own tasks.

Step 2:

bash reproduce_bert.sh

This script runs progressive stacking and trains a BERT model. The code was tested on 4 Tesla P40 GPUs (24 GB of GPU memory each). On different hardware, you will probably need to change the maximum number of tokens per batch (by changing max-tokens and update-freq).
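
When adapting the script to other hardware, it can help to keep the number of tokens per optimizer step roughly constant. The sketch below assumes the usual Fairseq semantics: max-tokens caps the tokens in one batch on one GPU, and update-freq accumulates gradients over that many batches before each update. The helper names and all numbers are illustrative assumptions, not the values used in reproduce_bert.sh.

```python
import math


def tokens_per_update(max_tokens, num_gpus, update_freq):
    """Approximate upper bound on tokens consumed per optimizer step."""
    return max_tokens * num_gpus * update_freq


def update_freq_for(target_tokens, max_tokens, num_gpus):
    """Smallest update_freq that reaches at least target_tokens per step."""
    return math.ceil(target_tokens / (max_tokens * num_gpus))


# Illustrative reference setup (assumed numbers): 4 GPUs.
reference = tokens_per_update(max_tokens=8192, num_gpus=4, update_freq=4)

# Illustrative smaller setup: 2 GPUs that only fit half as many tokens each.
new_update_freq = update_freq_for(reference, max_tokens=4096, num_gpus=2)

print("tokens per update on the reference setup:", reference)
print("update-freq needed on the smaller setup:", new_update_freq)
```

Lowering max-tokens to fit GPU memory and raising update-freq by the same factor keeps the effective batch size about the same, at the cost of more steps per update.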

Step 3:

bash reproduce_glue.sh

This script fine-tunes the BERT model trained in step 2. It uses the checkpoint trained for 400K steps, which corresponds to the stacking model in our paper.

Cite

@InProceedings{pmlr-v97-gong19a,
  title = 	 {Efficient Training of {BERT} by Progressively Stacking},
  author = 	 {Gong, Linyuan and He, Di and Li, Zhuohan and Qin, Tao and Wang, Liwei and Liu, Tieyan},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {2337--2346},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Long Beach, California, USA},
  month = 	 {09--15 Jun},
  publisher = 	 {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/gong19a/gong19a.pdf},
  url = 	 {http://proceedings.mlr.press/v97/gong19a.html},
}

Comments
  • Missing checkpoint when train on L6

    Hi,

    The BERT training script first trains L3, then L6, then L12. We were able to train L3, double the weights, and finish training L6. However, when we tried to double the weights from models/L6/checkpoint_1_70000.pt, the file did not exist.

    FileNotFoundError: [Errno 2] No such file or directory: 'models/L6/checkpoint_1_70000.pt'
    

    I noticed that training transformer_bert_L6_A12 uses --keep-interval-updates 5. According to the help text, this should keep the last 5 interval checkpoints, so we should be fine. Do you have any idea why training did not generate this checkpoint?

    opened by xerothermic 4
  • Typo in the reproduce_bert.sh ?

    Hi,

    I'm trying to reproduce the BERT performance demonstrated in the paper. I ran the reproduce_bert script, but it failed in two places. I wonder if these are simple typos?

    On line 26, ocorpus.train.txt has an extra 'o'. On line 33, should old_corpus.train.tok.${i} become corpus.train.tok.${i}?

    https://github.com/gonglinyuan/StackingBERT/blob/d1556758a378164f462c4ae817bf2c0b53e618b7/reproduce_bert.sh#L26-L33

    opened by xerothermic 1
  • missing required pip package

    Hi,

    This is another minor issue. Preparing the dataset for training also requires the scipy package. Could you please update requirements.txt?

    opened by xerothermic 0
  • Model checkpoint

    Hi,

    Thank you for your codebase. I am wondering if you can share a pretrained checkpoint for the model you used in the paper, i.e. the model we can get after running reproduce_bert.sh.

    opened by Michaelvll 0
Owner
Gong Linyuan