Code for Transformer Hawkes Process, ICML 2020.

Overview

Transformer Hawkes Process

Source code for Transformer Hawkes Process (ICML 2020).

Run the code

Dependencies

  • Python 3.7.
  • Anaconda contains all the required packages.
  • PyTorch version 1.4.0.

Instructions

  1. Put the data folder inside the root folder and modify the data entry in run.sh accordingly. The datasets are available here.
  2. Run the code with bash run.sh.

Note

  • Right now the code only supports single GPU training, but an extension to support multiple GPUs should be easy.
  • The reported event time prediction RMSE and the time stamps provided in the datasets are not in the same unit. For example, the provided time stamps can be in minutes while the reported results are in hours.
  • There are several factors that can be changed, besides the ones in run.sh:
    • In Main.py, function train_epoch, the event time prediction squared error needs to be properly scaled to stabilize training. At the same time, scale the diff variable in function time_loss in Utils.py accordingly.
    • In Utils.py, function log_likelihood, users can select whether to use numerical integration or Monte Carlo integration.
    • In transformer/Models.py, class Transformer, there is an optional recurrent layer. This is inspired by the fact that additional recurrent layers can better capture the sequential context, as suggested in this paper. In practice, this may or may not help, depending on the dataset.
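
The choice between numerical and Monte Carlo integration can be illustrated with a small standalone sketch. It is not the code in Utils.py: the `intensity` function and its parameters are hypothetical, chosen only so that the non-event term of the log-likelihood (the integral of the intensity over an inter-event interval) has something to integrate.

```python
import math
import random


def softplus(x):
    """Numerically stable softplus, log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def intensity(t, t_prev, base=0.2, slope=-0.5, history=1.0):
    """Hypothetical conditional intensity on (t_prev, t]; not the THP model itself."""
    return softplus(history + slope * (t - t_prev) + base)


def integral_trapezoid(t_prev, t_next, n_steps=100):
    """Deterministic estimate: trapezoidal rule on an evenly spaced grid."""
    h = (t_next - t_prev) / n_steps
    values = [intensity(t_prev + k * h, t_prev) for k in range(n_steps + 1)]
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])


def integral_monte_carlo(t_prev, t_next, n_samples=10000, seed=0):
    """Unbiased estimate: interval length times the mean intensity at uniform samples."""
    rng = random.Random(seed)
    total = sum(intensity(rng.uniform(t_prev, t_next), t_prev) for _ in range(n_samples))
    return (t_next - t_prev) * total / n_samples


numerical = integral_trapezoid(1.0, 3.0)
monte_carlo = integral_monte_carlo(1.0, 3.0)
print(f"trapezoid:   {numerical:.4f}")
print(f"monte carlo: {monte_carlo:.4f}")
```

The trapezoidal estimate is deterministic but carries a discretization bias on coarse grids, while the Monte Carlo estimate is unbiased but noisy. Which trade-off suits a given dataset is an empirical question.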

Reference

Please cite the following paper if you use this code.

@article{zuo2020transformer,
  title={Transformer Hawkes Process},
  author={Zuo, Simiao and Jiang, Haoming and Li, Zichong and Zhao, Tuo and Zha, Hongyuan},
  journal={arXiv preprint arXiv:2002.09291},
  year={2020}
}
Comments
  • I tried to reproduce the RMSE results of your experiment.

    Dear Zuo, I hope everything is fine with you. After reading your paper, I am really interested in your work and have learned a lot from it. However, I have a small question: how did you obtain such a small RMSE? On the Financial data I get Minimum RMSE: xx.xxxxx, while the paper reports only 0.93. Are there hyper-parameters of the loss function that need to be adjusted? I set loss = 0 * event_loss + 0 * pred_loss + se / scale_time_loss, but it still doesn't work. Could you offer me some help? Thank you.

    Sincerely yours, Luning Zhang

    opened by DavidZhang88 3
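
One thing worth checking when comparing against the paper is the unit mismatch mentioned in the notes above, since RMSE scales linearly with the time unit. The sketch below uses made-up numbers and assumes time stamps in minutes with results reported in hours (a factor of 60). The actual factor for each dataset is not stated here and would need to be confirmed.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up inter-event times in minutes (illustrative only, not from any dataset).
actual_minutes = [30.0, 90.0, 45.0, 120.0]
predicted_minutes = [60.0, 60.0, 75.0, 90.0]

MINUTES_PER_HOUR = 60.0  # assumed conversion; check the unit of each dataset

rmse_minutes = rmse(predicted_minutes, actual_minutes)
rmse_hours = rmse(
    [p / MINUTES_PER_HOUR for p in predicted_minutes],
    [a / MINUTES_PER_HOUR for a in actual_minutes],
)

print(rmse_minutes)  # 30.0
print(rmse_hours)    # 0.5
```

The same predictions give an RMSE of 30 in minutes and 0.5 in hours, so a large gap to a reported number can come from the unit alone.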
  • Loss setting: can we mix likelihood and RMSE together?

    In the paper, the loss function contains three parts: (image)

    The first part is the log-likelihood, the second is the event cross-entropy loss, and the third is the time RMSE loss.

    I don't know whether it is appropriate to mix all of these together. I would expect that, if the log-likelihood is used, the only way to predict the next event's time is to compute the expectation of the PDF: (image)

    Can anyone help clarify this?

    opened by waystogetthere 0
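
The expectation described in this question can be approximated numerically. The sketch below illustrates that calculation under the standard point-process density p(t) = λ(t) exp(−∫ λ(s) ds), with the integral taken from the last event time to t. It is not the prediction layer used in this repository, and the function name `expected_next_time`, the truncation `horizon`, and the grid size are assumptions made for illustration.

```python
import math


def expected_next_time(intensity, t_last, horizon=50.0, n_steps=100000):
    """Approximate E[t] = integral of t * p(t) dt, with p(t) = lambda(t) * exp(-Lambda(t)).

    Lambda(t) is the integral of the intensity from t_last to t, accumulated
    with the trapezoidal rule. The outer integral is truncated at t_last + horizon.
    """
    h = horizon / n_steps
    cumulative = 0.0                 # running value of Lambda(t)
    prev_rate = intensity(t_last)
    prev_term = t_last * prev_rate   # t * p(t) at t = t_last, where Lambda = 0
    expectation = 0.0
    for k in range(1, n_steps + 1):
        t = t_last + k * h
        rate = intensity(t)
        cumulative += 0.5 * h * (prev_rate + rate)
        term = t * rate * math.exp(-cumulative)
        expectation += 0.5 * h * (prev_term + term)
        prev_rate, prev_term = rate, term
    return expectation


# Sanity check with a constant (Poisson) intensity, where E[t] = t_last + 1 / rate.
estimate = expected_next_time(lambda t: 2.0, t_last=1.0)
print(round(estimate, 3))  # 1.5
```

A prediction obtained this way depends only on the learned intensity, whereas a separate regression head trained with a squared-error term does not have to agree with it.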
  • Ambiguity in calculating log likelihood

    Hi, I have a question about how the log-likelihood (LL) is calculated in Utils.py, line 49, inside the function compute_integral_unbiased:

    temp_hid = torch.sum(temp_hid * type_mask[:, 1:, :], dim=2, keepdim=True)

    You only consider the events that occurred when calculating the integral, whereas according to the formula we should compute the integral for each event type. I would expect the output dimension of this function to be [Batch_size, Length, Num_types]; we should then sum over all num_types instead of reducing it to only the occurred events.

    I believe that underestimating this integral has led to your high overall LL compared to other studies.

    Looking forward to your clarification.

    opened by hojjatkarami 4
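
The two reductions contrasted in this issue can be compared on a toy example. The sketch uses plain Python lists in place of tensors and made-up numbers. `per_type_integral` and `type_mask` are hypothetical stand-ins and are not taken from Utils.py. It only shows how the masked sum and the sum over all types differ, and does not establish which one the repository intends.

```python
# Hypothetical per-type integrals of the intensity over each inter-event
# interval, indexed as [length][num_types]. The numbers are made up.
per_type_integral = [
    [0.5, 0.25, 0.25],
    [0.125, 0.75, 0.125],
]

# One-hot mask of the event type that actually occurred at each step.
type_mask = [
    [1, 0, 0],
    [0, 1, 0],
]

# Masked sum: keeps only the integral of the occurred type, as in the quoted line.
masked = [
    sum(value * keep for value, keep in zip(row, mask_row))
    for row, mask_row in zip(per_type_integral, type_mask)
]

# Full sum: integrates the total intensity over all event types.
full = [sum(row) for row in per_type_integral]

print(masked)  # [0.5, 0.75]
print(full)    # [1.0, 1.0]
```

Because the integral enters the log-likelihood with a negative sign, the masked version subtracts less than the full sum whenever the other types have non-zero intensity.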
  • The performance in the paper is not reproduced.

    Hi, I tried to reproduce the Transformer Hawkes Process on StackOverflow fold1. However, the accuracy and RMSE results are as below.

    ![Screen capture 2022-06-10 153013](https://user-images.githubusercontent.com/56212725/173004512-ba357b4d-244a-4f73-9ca1-9d9535f3f1df.png)

    I think I am missing something. Compared to the released code of the Self-Attentive Hawkes Process, I do not think it is because of the scaling factor. What makes the difference between the paper and this repository?

    opened by KanghoonYoon 7
  • Should event likelihood be computed using current or last hidden state?

    Suppose the transformer hidden state at event i is h_i, should the likelihood of this event be computed using h_i or h_{i-1}?

    Using h_{i-1} makes more sense to me, because it encourages the model to assign high intensity to the true next event and therefore to learn to forecast.

    But the implementation and the paper seem to use h_i. The problem is that, since the transformer is given the true event i as part of the input, it can simply learn to output an arbitrarily high intensity for the correct event type in order to maximize the likelihood. The learned model would then have no predictive power.

    I feel I must have missed something. Any clarification is appreciated. Thanks.

    opened by mistycheney 2
  • Instructions to obtain Structured-THP datasets

    Could you please provide additional details on how to obtain the 911-Calls and Earthquake datasets used in your paper? The CSV found at the provided webpage has 663,522 calls, all of which are in the EMS, fire, or traffic categories. For the 75 most frequent ZIP codes in this dataset, there are 582,045 total calls, which is considerably more than the 290,293 listed in Table 1 (see the code below).

    import pandas as pd
    
    df = pd.read_csv("911.csv")
    print(len(df))  # 663522
    cats = ["EMS: ", "Fire: ", "Traffic: "]
    in_cats = 0
    for title in df["title"]:
        for cat in cats:
            if cat in title:
                in_cats += 1
                break
    
    print(in_cats)  # 663522
    zip_calls = (
        df.groupby("zip")
        .size()
        .reset_index(name="n_calls")
        .sort_values("n_calls", ascending=False)
    )
    print(zip_calls["n_calls"][:75].sum())  # 582045
    

    The paper also states that:

    An undirected edge exists between two vertices if their zipcodes are within 10 of each other.

    Does this mean two vertices were considered neighbors if abs(ZIP_{1} - ZIP_{2}) <= 10?

    For the Earthquake dataset, the provided website is in Chinese and seems to host a number of datasets. Could you provide precise instructions on where to find the specific earthquake dataset used in your paper?

    opened by airalcorn2 0
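
The reading of the edge rule asked about above can be written down as a short sketch. It encodes only that one interpretation, abs(ZIP_1 - ZIP_2) <= 10, which is an assumption rather than a confirmed description of the preprocessing. The function name `build_edges` and the ZIP codes are made up for illustration.

```python
from itertools import combinations


def build_edges(zip_codes, max_gap=10):
    """Return undirected edges between ZIP codes whose numeric difference is at most max_gap.

    This encodes one possible reading of the paper's sentence; it is an
    assumption, not a confirmed description of the authors' preprocessing.
    """
    unique = sorted(set(zip_codes))
    return [(a, b) for a, b in combinations(unique, 2) if abs(a - b) <= max_gap]


# Made-up ZIP codes for illustration.
edges = build_edges([19401, 19403, 19446, 19440, 19002])
print(edges)  # [(19401, 19403), (19440, 19446)]
```

Under this reading, adjacency depends only on the numeric ZIP values, so numerically close ZIP codes would be linked regardless of how far apart they are geographically.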
Owner
Simiao Zuo
PhD Student @ Georgia Tech