Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Wenqi Zhao

Last update: Dec 27, 2022

Related tags

Overview

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

Description

Convert offline handwritten mathematical expression to LaTeX sequence using bidirectionally trained transformer.

How to run

First, install dependencies

# clone project   
git clone https://github.com/Green-Wood/BTTR

# install project   
cd BTTR
conda create -y -n bttr python=3.7
conda activate bttr
conda install --yes -c pytorch pytorch=1.7.0 torchvision cudatoolkit=<your-cuda-version>
pip install -e .

Next, navigate to any file and run it. It may take 6~7 hours to coverage on 4 gpus using ddp.

# module folder
cd BTTR

# train bttr model using 4 gpus and ddp
python train.py --config config.yaml

For single gpu user, you may change the config.yaml file to

gpus: 1
# gpus: 4
# accelerator: ddp

Imports

This project is setup as a package which means you can now easily import any file into any other file like so:

from bttr.datamodule import CROHMEDatamodule
from bttr import LitBTTR
from pytorch_lightning import Trainer

# model
model = LitBTTR()

# data
dm = CROHMEDatamodule(test_year=test_year)

# train
trainer = Trainer()
trainer.fit(model, datamodule=dm)

# test using the best model!
trainer.test(datamodule=dm)

Note

Metrics used in validation is not accurate.

For more accurate metrics:

use test.py to generate result.zip
download and install crohmelib, lgeval, and tex2symlg tool.
convert tex file to symLg file using tex2symlg command
evaluate two folder using evaluate command

Citation

@article{zhao2021handwritten,
  title={Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer},
  author={Zhao, Wenqi and Gao, Liangcai and Yan, Zuoyu and Peng, Shuai and Du, Lin and Zhang, Ziyin},
  journal={arXiv preprint arXiv:2105.02412},
  year={2021}
}

Comments

can you provide predict.py code?

Hi ~ @Green-Wood.

I feel grateful mind for your help. I wanna get predict.py code that prints latex from an input image. If this code is provided, it will be very useful to others as well.

Best regards.

opened by ai-motive 17
val_exprate=0 and save checkpoint

hello!thanks for your time! When I transfer some code in decoder or use it directly,the val_exprate are always be 0.000,I don't know why. Another problem is,I noticed that this code don't have the function to save checkpoint or something.Can you give me some help?Thanks again!

opened by Ashleyyyi 6
Val_exprate = 0

When I retrained the model according to the instruction, the val_exprate was always 0.00, did anyone encounter this problem, thank you! (I has not modified any codes) @Green-Wood

opened by qingqianshuying 4

test.py error occurs

When I run test.py code, the following error occurs. Can i get some helps?

in test.py code test_year = "2016" ckp_path = "pretrained model"

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Load data from: /home/motive/PycharmProjects/BTTR/bttr/datamodule/../../data.zip
Extract data from: 2016, with data size: 1147
total  1147 batch data loaded
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.01s/it]ExpRate: 0.32258063554763794
length of total file: 1147
Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.52it/s]
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{}
--------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/motive/PycharmProjects/BTTR/test.py", line 17, in <module>
    trainer.test(model, datamodule=dm)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
    results = self._run(model)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 759, in _run
    self.post_dispatch()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 789, in post_dispatch
    self.accelerator.teardown()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 51, in teardown
    self.lightning_module.cpu()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 141, in cpu
    return super().cpu()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in cpu
    return self._apply(lambda t: t.cpu())
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in _apply
    setattr(this, key, [fn(cur_v) for cur_v in current_val])
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in <listcomp>
    setattr(this, key, [fn(cur_v) for cur_v in current_val])
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in <lambda>
    return self._apply(lambda t: t.cpu())
AttributeError: 'tuple' object has no attribute 'cpu'

opened by ai-motive 3

How long does BTTR take to train?

Hi, thank you for great repository!

How long does it take to train for your experiment in the paper? I mean training on CROHME 2014/2016/2019 on four NVIDIA 1080Ti GPUs.

Thanks,

opened by RyosukeFukatani 2
can you provide transfer learning code?

Hi~ @Green-Wood

I wanna apply trasnfer learning using pretrained model.

but, LightningCLI() is wrapped and difficult to customize.

Thanks & best regards.

opened by ai-motive 1
How can it get pretrained model ?

Hi, I wanna test your BTTR model but, it need to training process which will take a lot of time. So, can you give me a pretrained model link?

Best regards.

opened by ai-motive 1
After adding new token in dictionary getting error .

Hi , getting error after adding new token in dictionary.txt

Error(s) in loading state_dict for LitBTTR: size mismatch for bttr.decoder.word_embed.0.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.bias: copying a param with shape torch.Size([113]) from checkpoint, the shape in current model is torch.Size([115]).

Kindly help me out how can i fix this error.

opened by shivankaraditi 0
About dataset

Could you tell me how to generate the offline math expression image from inkml file? My experiment show that a large scale image could improve the result obviously，so I'd like to know if there is unified offline data for academic research.

opened by lightflash7 0
predicting on gpu is slower

Hi ,

As this model is a bit slower compared to the existing state-of-the-art model on CPU. So I tried to make predictions on GPU and surprisingly it slower on Gpu compare to CPU as well.

I am attaching a code snapshot here

device = torch.device('cuda')if torch.cuda.is_available() else torch.device('cpu')

model = LitBTTR.load_from_checkpoint('pretrained-2014.ckpt',map_location=device)

img = Image.open(img_path) img = ToTensor()(img) img.to(device)

t1 = time.time() hyp = model.beam_search(img) t2 = time.time()

Kindly help me out here how i can reduce prediction time

FYI - using GPU on aws g4dn.xlarge configuration machine

opened by Suma3 1
how to use TensorBoard?

hello i don't know how to add scalar to TensorBoard? I want to do this kind of topic, hoping to improve some ExpRate, but I don’t know much about lightning TensorBoard.

opened by win5923 9

Releases(v2.0)

v2.0(Jun 15, 2021)

new pretrained model
Source code(tar.gz)
Source code(zip)
pretrained-2014.ckpt(73.95 MB)

Owner

Wenqi Zhao

Student in Nanjing University

GitHub

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

198 Dec 27, 2022

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

298 Dec 21, 2022

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

This project releases our 1st place solution on ICDAR 2021 Competition on Mathematical Formula Detection. We implement our solution based on MMDetection, which is an open source object detection toolbox based on PyTorch.

94 Dec 25, 2022

Official implementation of the ICLR 2021 paper

You Only Need Adversarial Supervision for Semantic Image Synthesis Official PyTorch implementation of the ICLR 2021 paper "You Only Need Adversarial S

272 Dec 28, 2022

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

111 Dec 31, 2022

This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

BiPointNet: Binary Neural Network for Point Clouds Created by Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Li

59 Dec 17, 2022

This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

367 Dec 27, 2022

The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Graph Optimizer This repo contains the official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averagin

109 Dec 23, 2022

Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses"(2021) paper

ImageNet-21K Pretraining for the Masses Paper | Pretrained models Official PyTorch Implementation Tal Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelni

574 Jan 2, 2023

Official Pytorch Implementation of: "Semantic Diversity Learning for Zero-Shot Multi-label Classification"(2021) paper

Semantic Diversity Learning for Zero-Shot Multi-label Classification Paper Official PyTorch Implementation Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Bar

28 Aug 29, 2022

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

1.5k Dec 28, 2022

Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

123 Jan 1, 2023

Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022

Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

23 Nov 5, 2022

Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergence, and present a conditional cross-attention mechanism for fast DETR training. Our approach is motivated by that the cross-attention in DETR relies highly on the content embeddings and that the spatial embeddings make minor contributions, increasing the need for high-quality content embeddings and thus increasing the training difficulty.

281 Dec 30, 2022

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

WSRGlow The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio sa

96 Jan 3, 2023

The Official Implementation of the ICCV-2021 Paper: Semantically Coherent Out-of-Distribution Detection.

SCOOD-UDG (ICCV 2021) This repository is the official implementation of the paper: Semantically Coherent Out-of-Distribution Detection Jingkang Yang,

62 Nov 21, 2022

Official implementation of the ICCV 2021 paper: "The Power of Points for Modeling Humans in Clothing".

The Power of Points for Modeling Humans in Clothing (ICCV 2021) This repository contains the official PyTorch implementation of the ICCV 2021 paper: T

158 Nov 24, 2022

This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab

89 Dec 26, 2022

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Related tags

Overview

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

Description

How to run

Imports

Note

Citation

Comments

Releases(v2.0)

v2.0(Jun 15, 2021)

Owner

Wenqi Zhao

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

Official implementation of the ICLR 2021 paper

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses"(2021) paper

Official Pytorch Implementation of: "Semantic Diversity Learning for Zero-Shot Multi-label Classification"(2021) paper

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

The Official Implementation of the ICCV-2021 Paper: Semantically Coherent Out-of-Distribution Detection.

Official implementation of the ICCV 2021 paper: "The Power of Points for Modeling Humans in Clothing".

This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.