An application that maps an image of a LaTeX math equation to LaTeX code.

Last update: Jan 6, 2023

Related tags

Image Processing latex deep-learning pytorch transformer encoder-decoder end-to-end-machine-learning streamlit-webapp

Overview

Image to LaTeX


Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 (up to layer3) as the encoder and a Transformer as the decoder, trained with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model achieves reasonable performance on the test set, it performs poorly when the image quality, padding, or font size differs from that of the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing of the dataset (e.g., heavy downsampling).

To address this, I used the raw dataset and included image augmentation (e.g., random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.
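The batch padding described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's code: the function name `pad_batch`, the nested-list image representation, the fill value, and the choice to pad on the bottom and right are all assumptions made for the example.

```python
from typing import List

Image = List[List[float]]  # one grayscale image as rows of pixel values


def pad_batch(images: List[Image], fill: float = 0.0) -> List[Image]:
    """Pad every image (bottom and right) to the largest height and width in the batch."""
    max_h = max(len(img) for img in images)
    max_w = max(len(row) for img in images for row in img)
    padded = []
    for img in images:
        # Extend each existing row to the common width.
        rows = [list(row) + [fill] * (max_w - len(row)) for row in img]
        # Append blank rows until the common height is reached.
        rows += [[fill] * max_w for _ in range(max_h - len(img))]
        padded.append(rows)
    return padded


batch = [
    [[1.0, 1.0], [1.0, 1.0]],  # 2 x 2 image
    [[2.0, 2.0, 2.0]],         # 1 x 3 image
]
padded = pad_batch(batch)
print([(len(img), len(img[0])) for img in padded])  # [(2, 3), (2, 3)]
```

Because the target size is recomputed for every batch, the model sees the same formula at many different canvas sizes over the course of training.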

Additional problems that I found in the dataset:

Some LaTeX code produces visually identical outputs (e.g., \left( and \right) look the same as ( and )), so I normalized them.
Some LaTeX code is used to add space (e.g., \vspace{2px} and \hspace{0.3mm}). However, the length of the space is difficult to judge. Also, I don't want the model to generate code for blank images, so I removed these commands.
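The two cleaning rules above can be approximated with regular expressions. The following is a hedged sketch rather than the project's actual preprocessing: the function name `normalize_formula` and both patterns are assumptions, and only the commands named above (\left(, \right), \vspace, \hspace) are handled.

```python
import re

# Spacing commands that take a length argument, e.g. \vspace{2px} or \hspace{0.3mm}.
SPACING = re.compile(r"\\[vh]space\s*\*?\s*\{[^{}]*\}")
# \left or \right directly followed by a plain parenthesis (optionally after a space).
SIZED_PAREN = re.compile(r"\\(?:left|right)\s*([()])")


def normalize_formula(formula: str) -> str:
    """Drop \\vspace/\\hspace and replace \\left( and \\right) with bare parentheses."""
    formula = SPACING.sub(" ", formula)
    formula = SIZED_PAREN.sub(r"\1", formula)
    # Collapse the whitespace left behind by the removed commands.
    return " ".join(formula.split())


print(normalize_formula(r"\left( x \right) \hspace{0.3mm} + \vspace{2px} y"))  # ( x ) + y
```

Commands such as \leftarrow are left untouched because the pattern requires a parenthesis right after \left or \right.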

The best run has a character error rate (CER) of 0.17 on the test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)
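For readers unfamiliar with the metric, here is a minimal sketch of one common definition of CER: the Levenshtein edit distance between prediction and reference, divided by the reference length. This is an assumption about the definition, written in plain Python; the project may compute the distance with a library and over tokens rather than characters.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance using a single rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution (free if symbols match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(prediction: str, reference: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(prediction, reference) / len(reference)


print(character_error_rate("abxd", "abcd"))  # 0.25
```

Under this definition, a stray spacing command in the prediction counts as several insertions, which fits the observation that spacing commands account for most errors.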

Possible improvements include:

Do a better job cleaning the data (e.g., removing spacing commands)
Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss was still going down)
Use beam search (I only implemented greedy search)
Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).
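To make the greedy-versus-beam-search item concrete, here is a toy sketch. It is not tied to the trained model: `toy_step` is a hypothetical stand-in for the decoder that returns log-probabilities for the next token, and the beam search shown applies no length normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"
StepFn = Callable[[Tuple[str, ...]], Dict[str, float]]  # prefix -> next-token log-probs


def greedy_search(step: StepFn, max_len: int) -> Tuple[str, ...]:
    """Always take the single most likely next token."""
    seq: Tuple[str, ...] = ()
    for _ in range(max_len):
        log_probs = step(seq)
        token = max(log_probs, key=log_probs.get)
        seq += (token,)
        if token == EOS:
            break
    return seq


def beam_search(step: StepFn, max_len: int, beam_width: int) -> Tuple[str, ...]:
    """Keep the beam_width best partial sequences by total log-probability."""
    beams: List[Tuple[float, Tuple[str, ...]]] = [(0.0, ())]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq and seq[-1] == EOS:
                candidates.append((score, seq))  # finished hypotheses are carried over
                continue
            for token, lp in step(seq).items():
                candidates.append((score + lp, seq + (token,)))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == EOS for _, seq in beams):
            break
    return beams[0][1]


def toy_step(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical decoder: 'a' looks best at first, but 'b z' is the best sequence."""
    table = {
        (): {"a": math.log(0.6), "b": math.log(0.4)},
        ("a",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("b",): {"z": math.log(0.9), "w": math.log(0.1)},
    }
    return table.get(prefix, {EOS: 0.0})


print(greedy_search(toy_step, max_len=5))               # ('a', 'x', '<eos>')
print(beam_search(toy_step, max_len=5, beam_width=2))   # ('b', 'z', '<eos>')
```

Greedy search commits to "a" (probability 0.6) and ends with a sequence of probability 0.30, while a beam of width 2 keeps "b" alive and finds the sequence with probability 0.36.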

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or on the command line. See Hydra's documentation to learn more.
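As a rough mental model of what overrides such as trainer.gpus=1 do, each dotted key addresses a nested entry in the configuration and replaces its value. The sketch below imitates that idea in plain Python. It is not Hydra's implementation, and the default values in `config` are made up for the example rather than taken from conf/config.yaml.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'a.b=value' style overrides to a nested dict (modified in place)."""
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        try:
            value = ast.literal_eval(raw_value)  # "32" -> 32, "0.1" -> 0.1
        except (ValueError, SyntaxError):
            value = raw_value  # anything else stays a plain string
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})  # create missing sections on the way down
        node[leaf] = value
    return config


config = {"trainer": {"gpus": 0}, "data": {"batch_size": 8}}  # made-up defaults
apply_overrides(config, ["trainer.gpus=1", "data.batch_size=32"])
print(config)  # {'trainer': {'gpus': 1}, 'data': {'batch_size': 32}}
```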

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or log in to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of <entity>/<project>/<run_id>. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run:

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.
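Judging from the responses quoted in the issue reports further down this page, the prediction endpoint returns a JSON object with message, status-code, and data.pred fields. The sketch below shows how a client might unpack such a response. The helper name `extract_prediction` is hypothetical, and the sample body is a shortened, hand-written example in the same shape.

```python
import json


def extract_prediction(response_body: str) -> str:
    """Return the predicted LaTeX string from a JSON response body."""
    payload = json.loads(response_body)
    if payload.get("status-code") != 200:
        raise RuntimeError(f"prediction failed: {payload.get('message')}")
    return payload["data"]["pred"]


# Hand-written sample in the shape seen in the issue reports (assumption).
sample = '{"message":"OK","status-code":200,"data":{"pred":"E = m c ^ { 2 }"}}'
print(extract_prediction(sample))  # E = m c ^ { 2 }
```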

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

This project is inspired by the project ideas section in the final project guidelines of the course Full Stack Deep Learning at UC Berkeley.
MLOps - Made with ML for introducing Makefile, pre-commit, GitHub Actions and Python packaging.
harvardnlp/im2markup for pre-processing the im2latex-100k dataset.

Comments

Failed to install editdistance library in Win11

  Building wheel for editdistance (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'D:\Applications\WPy64-3850\python-3.8.5.amd64\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'E:\\Temp\\pip-install-lr00wmov\\editdistance\\setup.py'"'"'; __file__='"'"'E:\\Temp\\pip-install-lr00wmov\\editdistance\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'E:\Temp\pip-wheel-uujy4pu5'
       cwd: E:\Temp\pip-install-lr00wmov\editdistance\
  Complete output (31 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.8
  creating build\lib.win-amd64-3.8\editdistance
  copying editdistance\__init__.py -> build\lib.win-amd64-3.8\editdistance
  copying editdistance\_editdistance.h -> build\lib.win-amd64-3.8\editdistance
  copying editdistance\def.h -> build\lib.win-amd64-3.8\editdistance
  running build_ext
  building 'editdistance.bycython' extension
  creating build\temp.win-amd64-3.8
  creating build\temp.win-amd64-3.8\Release
  creating build\temp.win-amd64-3.8\Release\editdistance
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I./editdistance -ID:\Applications\WPy64-3850\python-3.8.5.amd64\include -ID:\Applications\WPy64-3850\python-3.8.5.amd64\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IE:\Windows Kits\10\include\10.0.19041.0\ucrt" "-IE:\Windows Kits\10\include\10.0.19041.0\shared" "-IE:\Windows Kits\10\include\10.0.19041.0\um" "-IE:\Windows Kits\10\include\10.0.19041.0\winrt" "-IE:\Windows Kits\10\include\10.0.19041.0\cppwinrt" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30037\include" "-IE:\Windows Kits\10\Include\10.0.19041.0\ucrt" "-IE:\Windows Kits\10\Include\10.0.19041.0\um" "-IE:\Windows Kits\10\Include\10.0.19041.0\shared" "-IE:\Windows Kits\10\Include\10.0.19041.0\winrt" /EHsc /Tpeditdistance/_editdistance.cpp /Fobuild\temp.win-amd64-3.8\Release\editdistance/_editdistance.obj
  _editdistance.cpp
  editdistance/_editdistance.cpp(1): warning C4819: The file contains a character that cannot be represented in the current code page (936). Save the file in Unicode format to prevent data loss
  editdistance/_editdistance.cpp(117): error C2059: syntax error: 'if'
  editdistance/_editdistance.cpp(118): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(119): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(120): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(121): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(122): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(123): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(124): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(125): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(126): error C2059: syntax error: 'else'
  editdistance/_editdistance.cpp(127): error C2059: syntax error: 'return'
  editdistance/_editdistance.cpp(128): error C2059: syntax error: '}'
  editdistance/_editdistance.cpp(128): error C2143: syntax error: missing ';' before '}'
  error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30037\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
  ----------------------------------------
  ERROR: Failed building wheel for editdistance

opened by ouening 2

I can't get good results from the API

I got CER of 0.06 on im2latex-100K. But I can't get good results from the API. Please give me some advice.

ls -l checkpoints

drwxr-xr-x 3 root root 4096 Apr 13 15:40 'epoch=1-val'
drwxr-xr-x 3 root root 4096 Apr 13 15:54 'epoch=3-val'
drwxr-xr-x 3 root root 4096 Apr 13 16:07 'epoch=5-val'
drwxr-xr-x 3 root root 4096 Apr 13 16:21 'epoch=7-val'
drwxr-xr-x 3 root root 4096 Apr 13 16:34 'epoch=9-val'
drwxr-xr-x 3 root root 4096 Apr 13 16:48 'epoch=11-val'
drwxr-xr-x 3 root root 4096 Apr 13 17:01 'epoch=13-val'

ls -l epoch=13-val

drwxr-xr-x 2 root root 4096 Apr 15 09:42 'loss=0.12-val'

ls -l loss=0.12-val

-rw-r--r-- 1 root root 2062314067 Apr 13 17:01 'cer=0.06.ckpt'

vi test_predictions.txt

\alpha _ { 1 } ^ { r } \gamma _ { 1 } + \ldots + \alpha _ { N } ^ { r } \gamma _ { N } = 0 \quad ( r = 1 , . . . , R ) \ , \eta = - \frac { 1 } { 2 } \operatorname { l n } ( \frac { \operatorname { c o s h } ( \sqrt { 2 } b _ { \infty } \sqrt { 1 + \alpha ^ { 2 } } y - \operatorname { a r c s i n h } \alpha ) } { \sqrt { 1 + \alpha ^ { 2 } } } } } ) P _ { ( 2 ) } ^ { - } = \int \beta d \beta d ^ { 9 } p d ^ { 8 } \lambda \Phi ( - p , - \lambda ) ( - \frac { p ^ { I } p ^ { I } } { 2 \beta } ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \lambda ) \Phi ( p , \Gamma ( z + 1 ) = \int _ { 0 } ^ { \infty } ; d x ; e ^ { - x } x ^ { z } . \frac { d } { d s } { \bf C } _ { i } = \frac { 1 } { 2 } \epsilon _ { i j k } { \bf C } _ { j } \times { \bf C } _ { k } , .

API Test

curl -X 'POST' 'http://0.0.0.0:8000/predict/' -H 'accept: application/json' -H 'Content-Type: multipart/form-data' -F '[email protected];type=image/png'

{"message":"OK","status-code":200,"data":{"pred":"E = m c ^ { 2 } \qquad \qquad E = m c ^ { 2 } \qquad E = m c ^ { 2 }"}}

curl -X 'POST' 'http://0.0.0.0:8000/predict/' -H 'accept: application/json' -H 'Content-Type: multipart/form-data' -F '[email protected];type=image/png'

{"message":"OK","status-code":200,"data":{"pred":"{ I } _ { \mathrm { I } } { \mit { U } } } ( { \bf { I } } { \bf { U } } ) } ) \; \; \Longrightarrow \; { \bf { U } } { \binom { \bf { I } } } { { \bf { H } } } ) \; { \frac { 1 } { \longrightarrow } } \; { \bf { U } } } { \bf { I } } } \Bigg \{ { \bf { I } } } { \bf { I } } ) \; { \bf { T } } \; { \bf { U } } { \bf { U } } \; { \bf { U } } { \bf { U } } } \; { \bf { U } } { \bf { U } } { \bf"}}

curl -X 'POST' 'http://0.0.0.0:8000/predict/' -H 'accept: application/json' -H 'Content-Type: multipart/form-data' -F '[email protected];type=image/png'

{"message":"OK","status-code":200,"data":{"pred":"{ \bf { I } } ~ \longrightarrow ~ { \bf { I } } _ { I } \prod _ { I } \bigoplus _ { I } \bigoplus _ { \bf { I } } \bigoplus _ { \bf { O } } { \bf { I } } { \bf { I } } } { \bf { I } } { \bf { I } } { \bf { I } } } { \bf { I } } } { \bf { I } } { \bf { I } } } { \bf { I } } { \bf { I } } { \bf { I } } } { \bf { I } } } { \bf { I } } } { \bf { I } } } } { \bf { I } } } { \bf { I } }"}}

curl -X 'POST' 'http://0.0.0.0:8000/predict/' -H 'accept: application/json' -H 'Content-Type: multipart/form-data' -F '[email protected];type=image/png'

{"message":"OK","status-code":200,"data":{"pred":"\begin{array} { c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c"}}

opened by pdc-kaminaga 1

train error with self created data

/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [58,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [59,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [60,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [61,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [62,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [37,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Error executing job with overrides: ['trainer.gpus=1', 'data.batch_size=8']
Traceback (most recent call last):
  File "run_experiment.py", line 42, in <module>
    main()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/main.py", line 49, in decorated_main
    _run_hydra(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/_internal/utils.py", line 367, in _run_hydra
    run_and_report(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
    raise ex
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
    return func()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/_internal/utils.py", line 368, in <lambda>
    lambda: hydra.run(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 110, in run
    _ = ret.return_value
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
    raise self._return_value
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "run_experiment.py", line 36, in main
    trainer.tune(lit_model, datamodule=datamodule)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 688, in tune
    result = self.tuner._tune(model, scale_batch_size_kwargs=scale_batch_size_kwargs, lr_find_kwargs=lr_find_kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/tuner/tuning.py", line 54, in _tune
    result['lr_find'] = lr_find(self.trainer, model, **lr_find_kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/tuner/lr_finder.py", line 250, in lr_find
    trainer.tuner._run(model)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/tuner/tuning.py", line 64, in _run
    self.trainer._run(*args, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
    self.dispatch()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 799, in dispatch
    self.accelerator.start_training(self)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
    self._results = trainer.run_stage()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in run_stage
    return self.run_train()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 871, in run_train
    self.train_loop.run_training_epoch()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 499, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 738, in run_training_batch
    self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 434, in optimizer_step
    model_ref.optimizer_step(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1403, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 214, in step
    self.__optimizer_step(*args, closure=closure, profiler_name=profiler_name, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 134, in __optimizer_step
    trainer.accelerator.optimizer_step(optimizer, self._optimizer_idx, lambda_closure=closure, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 329, in optimizer_step
    self.run_optimizer_step(optimizer, opt_idx, lambda_closure, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 336, in run_optimizer_step
    self.training_type_plugin.optimizer_step(optimizer, lambda_closure=lambda_closure, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 193, in optimizer_step
    optimizer.step(closure=lambda_closure, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/optim/adamw.py", line 65, in step
    loss = closure()
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 732, in train_step_and_backward_closure
    result = self.training_step_and_backward(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 823, in training_step_and_backward
    result = self.training_step(split_batch, batch_idx, opt_idx, hiddens)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 290, in training_step
    training_step_output = self.trainer.accelerator.training_step(args)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 204, in training_step
    return self.training_type_plugin.training_step(*args)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 155, in training_step
    return self.lightning_module.training_step(*args, **kwargs)
  File "/home/nd/PycharmProjects/imagetolatex/image_to_latex/lit_models/lit_resnet_transformer.py", line 55, in training_step
    logits = self.model(imgs, targets[:, :-1])
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nd/PycharmProjects/imagetolatex/image_to_latex/models/resnet_transformer.py", line 88, in forward
    output = self.decode(y, encoded_x)  # (Sy, B, num_classes)
  File "/home/nd/PycharmProjects/imagetolatex/image_to_latex/models/resnet_transformer.py", line 122, in decode
    y = self.embedding(y) * math.sqrt(self.d_model)  # (Sy, B, E)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
    return F.embedding(
  File "/home/nd/anaconda3/envs/img2latex/lib/python3.8/site-packages/torch/nn/functional.py", line 2043, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered

opened by iamyangjy 1

can not download the dataset

Something seems to be wrong in the Data Preprocessing step: the dataset cannot be downloaded from the URL "http://lstm.seas.harvard.edu/latex/data/formula_images.tar.gz".

opened by BigF25 0

CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions you may contact us through this project's lead researcher Kasimir Schulz.
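The kind of check described in this report can be sketched as follows. This is an illustration of the general technique, not the patch from the pull request: the helper names `is_within_directory` and `safe_extractall` are chosen for this example, and the check covers member paths only (it does not inspect symbolic links inside the archive).

```python
import io
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """True if target resolves to a location inside directory."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extractall(tar: tarfile.TarFile, path: str = ".") -> None:
    """Refuse to extract if any member would land outside path."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"blocked path traversal in tar member: {member.name}")
    tar.extractall(path)


# Demo: build an in-memory archive whose only member tries to escape the target directory.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    info = tarfile.TarInfo(name="../evil.txt")
    info.size = 0
    archive.addfile(info, io.BytesIO(b""))
buffer.seek(0)

with tarfile.open(fileobj=buffer, mode="r") as archive:
    try:
        safe_extractall(archive, path="data")
        blocked = False
    except ValueError:
        blocked = True
print(blocked)  # True
```

Since Python 3.12, tarfile.extractall also accepts a filter argument (for example filter="data") that rejects such members, which may be a simpler option where that version is available.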

opened by TrellixVulnTeam 0

Error: virtual environment named venv

make : The term 'make' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1

make venv

+ CategoryInfo : ObjectNotFound: (make:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException

opened by amalrajjs 0

`make venv` fails: `error: Multiple top-level packages discovered in a flat-layout: ['api', 'conf', 'figures', 'streamlit', 'image_to_latex']`

 ❯ make venv
python3 -m venv venv
source venv/bin/activate && \
	python -m pip install --upgrade pip setuptools wheel && \
	make install-dev
Requirement already satisfied: pip in ./venv/lib/python3.10/site-packages (22.2.2)
Requirement already satisfied: setuptools in ./venv/lib/python3.10/site-packages (64.0.3)
Requirement already satisfied: wheel in ./venv/lib/python3.10/site-packages (0.37.1)
python -m pip install -e ".[dev]" --no-cache-dir
Obtaining file:///Users/evar/Base/_Code/uni/image-to-latex
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build editable did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      error: Multiple top-level packages discovered in a flat-layout: ['api', 'conf', 'figures', 'streamlit', 'image_to_latex'].
      
      To avoid accidental inclusion of unwanted files or directories,
      setuptools will not proceed with this build.
      
      If you are trying to create a single distribution with multiple packages
      on purpose, you should not rely on automatic discovery.
      Instead, consider the following options:
      
      1. set up custom discovery (`find` directive with `include` or `exclude`)
      2. use a `src-layout`
      3. explicitly set `py_modules` or `packages` with a list of names
      
      To find more information, look for "package discovery" on setuptools docs.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build editable did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
make[1]: *** [install-dev] Error 1
make: *** [venv] Error 2

opened by NightMachinery 3

Suggest to loosen the dependency on albumentations

Hi, your project image-to-latex requires "albumentations==1.0.3" in its dependencies. After analyzing the source code, we found that the following versions of albumentations are also suitable without affecting your project: albumentations 1.0.0, 1.0.1, and 1.0.2. Therefore, we suggest loosening the dependency on albumentations from "albumentations==1.0.3" to "albumentations>=1.0.0,<=1.0.3" to avoid possible conflicts when importing more packages or for downstream projects that may use image-to-latex.

May I open a pull request to loosen the dependency on albumentations?

By the way, could you please tell us whether this kind of dependency analysis could help make dependency maintenance easier during your development?

We also give our detailed analysis as follows for your reference:

Your project image-to-latex directly uses 5 APIs from package albumentations.

albumentations.augmentations.geometric.transforms.Affine.__init__, albumentations.pytorch.transforms.ToTensorV2.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.GaussianBlur.__init__, albumentations.augmentations.transforms.GaussNoise.__init__

Starting from the 5 APIs above, 15 functions are then called indirectly, including 14 of albumentations' internal APIs and 1 external API. The specific call graph is listed as follows (omitting some repeated function occurrences).

[/kingyiusuen/image-to-latex]
+--albumentations.augmentations.geometric.transforms.Affine.__init__
|      +--albumentations.core.transforms_interface.BasicTransform.__init__
|      +--albumentations.augmentations.geometric.transforms.Affine._handle_dict_arg
|      |      +--albumentations.core.transforms_interface.to_tuple
|      +--albumentations.augmentations.geometric.transforms.Affine._handle_translate_arg
|      +--albumentations.core.transforms_interface.to_tuple
+--albumentations.pytorch.transforms.ToTensorV2.__init__
|      +--albumentations.core.transforms_interface.BasicTransform.__init__
+--albumentations.core.composition.Compose.__init__
|      +--albumentations.core.composition.BaseCompose.__init__
|      |      +--albumentations.core.composition.Transforms.__init__
|      |      |      +--albumentations.core.composition.Transforms._find_dual_start_end
|      |      |      |      +--albumentations.core.composition.Transforms._find_dual_start_end
|      +--albumentations.augmentations.bbox_utils.BboxProcessor.__init__
|      |      +--albumentations.core.utils.DataProcessor.__init__
|      +--albumentations.core.composition.BboxParams.__init__
|      |      +--albumentations.core.utils.Params.__init__
|      +--albumentations.augmentations.keypoints_utils.KeypointsProcessor.__init__
|      |      +--albumentations.core.utils.DataProcessor.__init__
|      +--albumentations.core.composition.KeypointParams.__init__
|      |      +--albumentations.core.utils.Params.__init__
|      +--albumentations.core.composition.BaseCompose.add_targets
+--albumentations.augmentations.transforms.GaussianBlur.__init__
|      +--albumentations.core.transforms_interface.BasicTransform.__init__
|      +--albumentations.core.transforms_interface.to_tuple
|      +--warnings.warn
+--albumentations.augmentations.transforms.GaussNoise.__init__
|      +--albumentations.core.transforms_interface.BasicTransform.__init__

We scanned the versions of albumentations and observed that, between each of the versions 1.0.0, 1.0.1, and 1.0.2 and version 1.0.3, the functions that changed (diffs listed below) have no intersection with any function or API mentioned above (either directly or indirectly called by this project).

diff: 1.0.3(original) 1.0.0
['albumentations.core.composition.Compose._check_data_post_transform', 'albumentations.core.utils.DataProcessor.postprocess', 'albumentations.augmentations.transforms.PadIfNeeded.update_params', 'albumentations.core.composition.Compose.__call__', 'albumentations.core.composition.Compose', 'albumentations.core.utils.get_shape', 'albumentations.augmentations.transforms.Normalize', 'albumentations.augmentations.geometric.transforms.Affine', 'albumentations.augmentations.crops.transforms.CropAndPad._get_px_params', 'albumentations.augmentations.crops.transforms.CropAndPad', 'albumentations.augmentations.transforms.Superpixels', 'albumentations.augmentations.transforms.PadIfNeeded', 'albumentations.augmentations.bbox_utils.convert_bbox_from_albumentations', 'albumentations.core.utils.DataProcessor', 'albumentations.augmentations.transforms.PadIfNeeded.PositionType', 'albumentations.augmentations.transforms.PadIfNeeded.__update_position_params', 'albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations', 'albumentations.augmentations.transforms.PadIfNeeded.__init__', 'albumentations.augmentations.bbox_utils.check_bbox']

diff: 1.0.3(original) 1.0.1
['albumentations.core.composition.Compose.__call__', 'albumentations.core.composition.Compose', 'albumentations.core.composition.Compose._check_data_post_transform', 'albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations', 'albumentations.augmentations.transforms.Superpixels', 'albumentations.core.utils.DataProcessor.postprocess', 'albumentations.core.utils.get_shape', 'albumentations.augmentations.bbox_utils.convert_bbox_from_albumentations', 'albumentations.core.utils.DataProcessor', 'albumentations.augmentations.bbox_utils.check_bbox']

diff: 1.0.3(original) 1.0.2
['albumentations.core.composition.Compose', 'albumentations.core.composition.Compose._check_data_post_transform', 'albumentations.augmentations.transforms.Superpixels', 'albumentations.core.utils.DataProcessor.postprocess', 'albumentations.core.utils.get_shape', 'albumentations.core.utils.DataProcessor', 'albumentations.augmentations.bbox_utils.check_bbox']

As for other packages, the call graph shows that albumentations calls APIs of warnings, and the dependencies on these packages stay the same in our suggested versions, so no outside conflict is introduced.

Therefore, we believe that it is quite safe to loosen your dependency on albumentations from "albumentations==1.0.3" to "albumentations>=1.0.0,<=1.0.3". This will improve the applicability of image-to-latex and reduce the possibility of further dependency conflicts with other projects.
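To make the proposed range concrete, here is a minimal sketch of which releases "albumentations>=1.0.0,<=1.0.3" admits. The helpers `parse_version` and `in_range` are hypothetical names introduced for illustration. They handle only plain dotted numeric versions, whereas real installers apply the full PEP 440 rules.

```python
def parse_version(text):
    """Turn a plain dotted version such as '1.0.3' into (1, 0, 3)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower="1.0.0", upper="1.0.3"):
    """Check lower <= version <= upper, both bounds inclusive."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


for candidate in ["0.5.2", "1.0.0", "1.0.2", "1.0.3", "1.1.0"]:
    status = "allowed" if in_range(candidate) else "rejected"
    print(candidate, status)
```

Comparing integer tuples instead of raw strings matters once a component reaches two digits: as strings, "1.0.10" sorts before "1.0.3", while the tuple comparison correctly places it outside the range.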

opened by Agnes-U 0


pix2tex: Using a ViT to convert images of equations into LaTeX code.

The goal of this project is to create a learning based system that takes an image of a math formula and returns corresponding LaTeX code.

2.6k Dec 30, 2022

Visage Differentiation is a GUI application for outlining and labeling the visages in an image.

Visage Differentiation Visage Differentiation is a GUI application for outlining and labeling the visages in an image. The main functionality is provi

0 Jan 13, 2022

QR code python application which can read(decode) and generate(encode) QR codes.

QR Code Application This is a basic QR Code application. Using this application you can generate QR code for your text/links. Using this application yo

1 Aug 9, 2022

An open source image editor which can manipulate an image in many ways!

Image Editor - An open source image editor which can manipulate an image in many ways! If you need any more modes in repo or I

44 Nov 17, 2022

Image enhancing model for making a blurred image to be somehow clearer than before

This is a very small project that enhances images given an input image. It has several features, such as detecting faces and enhancing them, as well as enhancing the whole image.

3 Dec 3, 2021

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

img2dataset Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. Also supports

1.4k Jan 1, 2023

Nanosensor Image Processor (NanoImgPro), a python-based image analysis tool for dopamine nanosensors

NanoImgPro Nanosensor Image Processor (NanoImgPro), a python-based image analysis tool for dopamine nanosensors NanoImgPro.py contains the main class

1 Mar 2, 2022

A pure python implementation of the GIMP XCF image format. Use this to interact with GIMP image formats

Pure Python implementation of the GIMP image formats (.xcf projects as well as brushes, patterns, etc)

8 Dec 30, 2022

Image-Viewer is a Windows image viewer based on Python 3.

Image-Viewer Hi! Image-Viewer is a Windows image viewer based on Python 3. Using You must download Image-Viewer.exe from the root of the repository. T

2 Apr 18, 2022

Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python

AICSImageIO Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Pure Python Features Supports reading metadata and imaging

Allen Institute for Cell Science - Modeling

137 Dec 14, 2022

Seaborn-image is a Python image visualization library based on matplotlib and provides a high-level API to draw attractive and informative images quickly and effectively.

seaborn-image: image data visualization Description Seaborn-image is a Python image visualization library based on matplotlib and provides a high-leve

48 Jan 5, 2023

This app finds duplicate and near-duplicate images by generating a hash value for each image and storing it in a specialized data structure called a VP-Tree, which makes searching for an image in a dataset of 100Ks of images almost instantaneous.

Offline Reverse Image Search Overview This app finds duplicate to near duplicate images by generating a hash value for each image stored with a specia

53 Nov 15, 2022

Quickly 'anonymize' all people in an image. This script will put a black bar over all eye-pairs in an image

Face-Detacher Quickly 'anonymize' all people in an image. This script will put a black bar over all eye-pairs in an image This is a small python scrip

1 Oct 29, 2021

Fast Image Retrieval is an open source image retrieval framework

Fast Image Retrieval is an open source image retrieval framework released by Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. This framework implements most of the major binary hashing methods, together with both popular backbone networks and public datasets.

39 Nov 25, 2022

Fast Image Retrieval (FIRe) is an open source image retrieval project

Fast Image Retrieval (FIRe) is an open source image retrieval project released by Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. This project implements most of the major binary hashing methods to date, together with different popular backbone networks and public datasets.

39 Nov 25, 2022

A Python Script to convert Normal PNG Image to Apple iDOT PNG Image.

idot-png-encoder A Python Script to convert Normal PNG Image to Apple iDOT PNG Image (Multi-threaded Decoding PNG). Usage idotpngencoder.py -i <inputf

2 Feb 17, 2022

Anaglyph 3D Converter - A python script that adds a 3D anaglyph style effect to an image using the Pillow image processing package.

2 Jan 22, 2022

Pyconvert is a Python script that you can use to convert image files to another image format (e.g. PNG to ICO).

1 Jan 16, 2022

Simple Python package to convert an image into a quantized image using a customizable palette

Simple Python package to convert an image into a quantized image using a customizable palette. Resulting image can be displayed by ePaper displays such as Waveshare displays.

3 Apr 13, 2022


