Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization."

Overview

DialogLM

Code for AAAI 2022 paper: DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization.

Pre-trained Models

We release two versions of pre-trained models.

  • DialogLM is based on UniLMv2. Depending on whether sparse attention is introduced, it comes in two versions that handle dialogues of different lengths.
  • DialogLED builds on the Longformer-Encoder-Decoder (LED) architecture and is further pre-trained on a large amount of long dialogue data, with window-based denoising as the pre-training task. You can use its base and large versions directly through Hugging Face; a minimal loading sketch follows below.
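
For a quick start, a DialogLED checkpoint can be loaded with the standard LED classes from the transformers library. The sketch below is illustrative rather than the repository's own pipeline: the checkpoint name is the large version referenced in the comments on this page, and the input-length and generation settings are assumptions.

    import torch
    from transformers import LEDTokenizer, LEDForConditionalGeneration

    # Checkpoint name referenced in the comments below; swap in the base version if preferred.
    checkpoint = "MingZhong/DialogLED-large-5120"
    tokenizer = LEDTokenizer.from_pretrained(checkpoint)
    model = LEDForConditionalGeneration.from_pretrained(checkpoint)

    dialogue = "Here goes my long dialogue."
    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True, max_length=5120)

    # LED-style models take a global attention mask; putting global attention
    # on the first token mirrors common LED usage.
    global_attention_mask = torch.zeros_like(inputs.input_ids)
    global_attention_mask[:, 0] = 1

    summary_ids = model.generate(
        inputs.input_ids,
        global_attention_mask=global_attention_mask,
        max_length=256,  # illustrative value, not a recommended setting
        num_beams=4,     # illustrative value, not a recommended setting
    )
    print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])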

Datasets

Please download the five datasets we used in our paper here (AMI, ICSI, QMSum, ForeverDreaming, TVMegaSite).

Finetuning for Downstream Tasks

Please go to the corresponding task folders to fine-tune and apply the models to downstream tasks involving long dialogues.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Comments
  • URLs for unilm vocab.txt are deprecated

    The vocab.txt links in tokenization_unilm.py are deprecated. They need to be updated to the latest checkpoint links according to [this link](https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt):

    PRETRAINED_VOCAB_FILES_MAP = {
        'vocab_file':
        {
            'unilm-large-cased': "https://conversationhub.blob.core.windows.net/beit-share-public/ckpt/unilm-large-cased-vocab.txt",
            'unilm-base-cased': "https://conversationhub.blob.core.windows.net/beit-share-public/ckpt/unilm-base-cased-vocab.txt",
            'unilm1-large-cased': "https://conversationhub.blob.core.windows.net/beit-share-public/ckpt/unilm1-large-cased-vocab.txt",
            'unilm1-base-cased': "https://conversationhub.blob.core.windows.net/beit-share-public/ckpt/unilm1-base-cased-vocab.txt",
            'unilm1.2-base-uncased': "https://conversationhub.blob.core.windows.net/beit-share-public/ckpt/unilm1.2-base-uncased-vocab.txt"
        }
    }
    

    Thus, the links in configuration_minilm.py, configuration_unilm.py, tokenization_minilm.py, and tokenization_unilm.py need to be updated as well.

    opened by TheBestHu 0
  • Pre-trained and fine-tuned models not generating dialogue summaries.

    Hello, I'm trying to generate summaries of dialogues using the pre-trained model (MingZhong/DialogLED-large-5120) as well as a fine-tuned one (on the AMI dataset). It does not produce any meaningful summaries; instead, it just replicates part of the dialogue. I'm adapting the AllenAI example (allenai/led-large-16384-arxiv) and using the Hugging Face LEDTokenizer and LEDForConditionalGeneration classes:

    LONG_DIALOGUE = """ Here goes my long dialogue."""" import torch from transformers import LEDTokenizer, LEDForConditionalGeneration

    tokenizer = LEDTokenizer.from_pretrained("AMI_DialogLED_large/") input_ids = tokenizer(LONG_DIALOGUE, return_tensors="pt").input_ids.to("cuda") global_attention_mask = torch.zeros_like(input_ids) global_attention_mask[:, 0] = 1

    model = LEDForConditionalGeneration.from_pretrained("AMI_DialogLED_large/", return_dict_in_generate=True).to("cuda") sequences = model.generate(input_ids, global_attention_mask=global_attention_mask).sequences

    summary = tokenizer.batch_decode(sequences) print(summary)

    With this code, the output is a gibberish dialogue rather than an actual summary like the ones shown in your model outputs. Is there another way to generate summaries, or should the model be used in a different way?

    Thanks,

    David

    opened by dafraile 0
  • Hyperparameters to replicate experiments

    Hi! Thank you for sharing the DialogLM and DialogLED implementations. I wonder if it's possible to release the hyperparameters used for ForeverDreaming and TVMegaSite. I see the DialogLM_UniLM folder uses TVMegaSite as an example and provides its hyperparameters. Does fine-tuning on ForeverDreaming use slightly different parameters? In particular, knowing num_training_steps and the batch size would be very useful for replicating the results presented in the paper.

    opened by Bobby-Hua 0
  • Error while running the script

    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 1199) of binary: /usr/bin/python3
    Traceback (most recent call last):
      File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in <module>
        main()
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main
        launch(args)
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 174, in launch
        run(args)
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 755, in run
        )(*cmd_args)
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
        failures=result.failures,
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

    opened by manish142000 2
  • How is DialogLM able to process 5,120 tokens without the hybrid attention approach?

    As mentioned in the research paper:

    DIALOGLM is obtained by further pre-training UNILMbase with the window-based denoising method. Its maximum input length is 5,120 and the tokens exceeding this length is truncated in the experiments. DIALOGLM-sparse additionally introduces the hybrid attention approach in the pre-training process of DIALOGLM, so its maximum length is increased to 8,192 tokens.

    I don't understand how DialogLM is able to process 5,120 tokens without the hybrid attention approach. Since it uses UniLMv2 as its backbone, shouldn't the maximum number of tokens it can process be 512?

    opened by Akshayextreme 0