PyTorch implementation of Integrating Tree Path in Transformer for Code Representation

Han Peng

Last update: Dec 23, 2022


Overview

This is the official PyTorch implementation of the approaches proposed in:

Han Peng, Ge Li, Wenhan Wang, Yunfei Zhao, Zhi Jin, "Integrating Tree Path in Transformer for Code Representation",

which appeared at NeurIPS 2021.

In this paper, we investigate the interaction between absolute and relative path encodings, and propose a novel code representation model, TPTrans, together with its variants. These models introduce path encoding as an inductive bias into the attention module of the Transformer, enabling the Transformer to capture the structure of source code.
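To make the two kinds of paths concrete, here is a minimal sketch on a toy syntax tree. It assumes that an absolute path runs from the root to a leaf and that a relative path is the shortest tree path between two leaves through their lowest common ancestor. The `Node` class, the helper names, and the node labels are hypothetical and are not taken from this repository.

```python
class Node:
    """A toy syntax-tree node (hypothetical, for illustration only)."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []

    def add(self, label):
        child = Node(label, parent=self)
        self.children.append(child)
        return child


def absolute_path(leaf):
    """Return the node labels from the root down to `leaf`."""
    labels = []
    node = leaf
    while node is not None:
        labels.append(node.label)
        node = node.parent
    return labels[::-1]


def relative_path(src, dst):
    """Return the labels from `src` up to the lowest common ancestor, then down to `dst`."""
    # Collect src and all of its ancestors, remembering their positions.
    up = []
    node = src
    while node is not None:
        up.append(node)
        node = node.parent
    position = {id(n): i for i, n in enumerate(up)}

    # Climb from dst until we reach a node that is also an ancestor of src.
    down = []
    node = dst
    while node is not None and id(node) not in position:
        down.append(node)
        node = node.parent
    if node is None:
        raise ValueError("src and dst are not in the same tree")

    lca_index = position[id(node)]
    return [n.label for n in up[: lca_index + 1]] + [n.label for n in reversed(down)]


# Toy tree for the statement `x = y + 1` (labels are illustrative).
root = Node("assignment")
leaf_x = root.add("identifier:x")
binary = root.add("binary_operator")
leaf_y = binary.add("identifier:y")
leaf_one = binary.add("integer:1")

print(absolute_path(leaf_y))          # ['assignment', 'binary_operator', 'identifier:y']
print(relative_path(leaf_x, leaf_y))  # ['identifier:x', 'assignment', 'binary_operator', 'identifier:y']
```

How the model encodes such paths and injects them into attention is described in the paper; this sketch only shows what the paths themselves look like.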

Please cite our paper if you use the model, experimental results, or our code in your own work.

1.1 Raw data

To run experiments with TPTrans and its variants, first create the datasets from the raw code snippets of the CodeSearchNet (CSN) dataset. Download and unzip the raw jsonl data of CSN into the raw_data directory, laid out as follows:

├── raw_data        
│   ├── python         
│   │   ├── train    
│   │   │   ├── XXXX.jsonl...
│   │   ├── test    
│   │   ├── valid   
│   ├── ruby          
│   ├── go        
│   ├── javascript
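As a quick sanity check of this layout, the following sketch counts the records found under each language and split. It assumes only that every `.jsonl` file holds one JSON object per line; the function names are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def iter_snippets(raw_dir, language, split):
    """Yield one parsed JSON record per non-empty line of every .jsonl file in a split."""
    split_dir = Path(raw_dir) / language / split
    if not split_dir.is_dir():
        return  # a missing split simply yields nothing
    for jsonl_file in sorted(split_dir.glob("*.jsonl")):
        with jsonl_file.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    yield json.loads(line)


def count_snippets(raw_dir,
                   languages=("python", "ruby", "go", "javascript"),
                   splits=("train", "valid", "test")):
    """Return a {(language, split): record count} mapping for the raw data directory."""
    return {
        (language, split): sum(1 for _ in iter_snippets(raw_dir, language, split))
        for language in languages
        for split in splits
    }


if __name__ == "__main__":
    for key, count in count_snippets("raw_data").items():
        print(key, count)
```

A count of zero for a language or split usually means the archive was unzipped into a different directory level than the one shown above.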

1.2 Tree-Sitter

Tree-sitter is an open-source parser that supports multiple programming languages. Please install it, then download the grammar files for the four programming languages into the vendor directory, laid out as follows:

├── vendor        
│   ├── tree-sitter-python  (from https://github.com/tree-sitter/tree-sitter-python)         
│   ├── tree-sitter-javascript  (from https://github.com/tree-sitter/tree-sitter-javascript)     
│   ├── tree-sitter-go  (from https://github.com/tree-sitter/tree-sitter-go)
│   ├── tree-sitter-ruby  (from https://github.com/tree-sitter/tree-sitter-ruby)
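Before running the parser, it can help to confirm that all four grammar directories are in place. The sketch below uses only the directory names listed above; the `missing_grammars` helper is hypothetical and not part of this repository.

```python
from pathlib import Path

# Grammar directories expected under vendor/, as listed in the layout above.
GRAMMARS = (
    "tree-sitter-python",
    "tree-sitter-javascript",
    "tree-sitter-go",
    "tree-sitter-ruby",
)


def missing_grammars(vendor_dir="vendor"):
    """Return the names of expected grammar directories that do not exist."""
    vendor = Path(vendor_dir)
    return [name for name in GRAMMARS if not (vendor / name).is_dir()]


if __name__ == "__main__":
    missing = missing_grammars()
    if missing:
        print("Missing grammar directories:", ", ".join(missing))
    else:
        print("All grammar directories found.")
```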

After that, run multi_language_parse.py in the parser directory to parse the raw code snippets into the data directory.

1.3 Training

After preprocessing, run _main.py_ to train the model.

To run TPTrans, set relation_path=True and absolute_path=False.

To run TPTrans-α, set relation_path=True and absolute_path=True.
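The two settings above can be summarized as a small lookup table. This is only an illustrative sketch: the flag names come from this README, while the `VARIANT_FLAGS` table, the `TPTrans-alpha` key spelling, and the helper functions are hypothetical. How the flags are actually passed to the training script is not shown here, so check the inline comments in the code.

```python
# Flag values per model variant, as stated in this README.
VARIANT_FLAGS = {
    "TPTrans": {"relation_path": True, "absolute_path": False},
    "TPTrans-alpha": {"relation_path": True, "absolute_path": True},
}


def flags_for(variant):
    """Return a copy of the flag settings for a named variant."""
    try:
        return dict(VARIANT_FLAGS[variant])
    except KeyError:
        known = ", ".join(sorted(VARIANT_FLAGS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}") from None


def as_overrides(variant):
    """Format the flags as 'name=value' strings, mirroring the notation used above."""
    return [f"{name}={value}" for name, value in flags_for(variant).items()]


if __name__ == "__main__":
    for name in VARIANT_FLAGS:
        print(name, as_overrides(name))
```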

For other command-line options, please refer to the inline comments in the code.

Contact

If you have any questions, please contact me via email ([email protected]) or open an issue on GitHub.


