FOTS Pytorch Implementation

Ning Lu

Last update: Dec 19, 2022

Overview

News!!! The recognition branch has now been added to the model. The whole project has been optimized and refactored.

Introduction

This is a PyTorch implementation of FOTS.

Instruction

Requirements

Build tools:
```
./build.sh
```
Prepare the dataset.
Create a virtual environment (you may need conda):

conda create --name fots --file spec-file.txt
conda activate fots
pip install -r reqs.txt

Training

# For single-GPU training, set gpus to [0], where 0 is the ID of your GPU.
python train.py -c pretrain.json
python train.py -c finetune.json
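
The GPU selection mentioned in the comment above can also be scripted. This is a minimal sketch, assuming the config is a JSON object with a top-level `gpus` list. The key name comes from the comment; its position in the file, the `with_gpus` helper, and the sample fields are assumptions for illustration.

```python
import json


def with_gpus(config_text, gpu_ids):
    """Return config JSON text with the assumed top-level "gpus" key replaced."""
    config = json.loads(config_text)
    config["gpus"] = list(gpu_ids)
    return json.dumps(config, indent=2)


# Hypothetical config content; the real pretrain.json will contain other fields.
original = '{"name": "pretrain", "gpus": [0, 1, 2, 3]}'
single_gpu = with_gpus(original, [0])
print(single_gpu)
```

If the real file nests `gpus` under another key, the assignment has to be adjusted accordingly.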

Evaluation

python eval.py -m <model.tar.gz> -i <input_images_folder> -o <output_folders>

Benchmarking and Models (Coming soon!)

Acknowledgement

https://github.com/SakuraRiven/EAST (Some code is copied from here.)
https://github.com/chenjun2hao/FOTS.pytorch.git (ROIRotate)

Comments

RuntimeError: Cannot compile lanms??

    make: Entering directory “/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms”
    g++ -o adaptor.so -I include -std=c++11 -O3 -I/home/rwd/anaconda2/envs/py351/include/python3.5m -I/home/rwd/anaconda2/envs/py351/include/python3.5m -Wno-unused-result -Wsign-compare -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -fdebug-prefix-map==/usr/local/src/conda/- -fdebug-prefix-map==/usr/local/src/conda-prefix -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -flto -DNDEBUG -fwrapv -O3 -Wall -Wstrict-prototypes -L/home/rwd/anaconda2/envs/py351/lib/python3.5/config-3.5m -L/home/rwd/anaconda2/envs/py351/lib -lpython3.5m -lpthread -ldl -lutil -lrt -lm -Xlinker -export-dynamic adaptor.cpp include/clipper/clipper.cpp --shared -fPIC
    g++: error: unrecognized command line option ‘-fno-plt’
    make: *** [adaptor.so] Error 1
    make: Leaving directory “/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms”
    Traceback (most recent call last):
      File "train.py", line 14, in <module>
        from trainer import Trainer
      File "/home/rwd/graduate/code/fots.PyTorch-master/trainer/__init__.py", line 1, in <module>
        from .trainer import *
      File "/home/rwd/graduate/code/fots.PyTorch-master/trainer/trainer.py", line 4, in <module>
        from utils.bbox import Toolbox
      File "/home/rwd/graduate/code/fots.PyTorch-master/utils/bbox.py", line 8, in <module>
        from . import lanms
      File "/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms/__init__.py", line 8, in <module>
        raise RuntimeError('Cannot compile lanms: {}'.format(BASE_DIR))
    RuntimeError: Cannot compile lanms: /home/rwd/graduate/code/fots.PyTorch-master/utils/lanms

opened by monologue1107 15
Loss: Nan
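I am using the SynthText dataset, and the following appears during training:

    RankWarning: Polyfit may be poorly conditioned
      edge = fit_line([p0[0], p1[0]], [p0[1], p1[1]])

The warning above does not affect training, but after training for a while the loss becomes NaN. I did not run build.sh because the university cluster environment has no NVCC. Other issues seem to suggest that compiling it is not required. I hope this problem can be resolved.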
opened by MiTeng0215 8
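
The `RankWarning` quoted in the issue above is what `np.polyfit` emits when the fit is rank deficient, for example when the two points passed to a degree-1 fit share the same x (a vertical edge). The sketch below reproduces that case and shows one possible guard. `fit_line_safe`, its `eps` threshold, and the sample points are assumptions for illustration, not the repository's `fit_line`, and nothing here establishes that this warning is the cause of the NaN loss.

```python
import warnings

import numpy as np


def polyfit_warns(xs, ys):
    """Return True if a degree-1 np.polyfit on (xs, ys) raises a RankWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        np.polyfit(xs, ys, deg=1)
    # Compare by name: the class lives in different modules across NumPy versions.
    return any(w.category.__name__ == "RankWarning" for w in caught)


def fit_line_safe(xs, ys, eps=1e-6):
    """Return (a, b, c) with a*x + b*y + c = 0 through two points."""
    if abs(xs[0] - xs[1]) < eps:
        # Vertical line x = xs[0]; skip polyfit, which cannot express it as y = k*x + b.
        return (1.0, 0.0, -float(xs[0]))
    k, b = np.polyfit(xs, ys, deg=1)
    return (float(k), -1.0, float(b))


vertical_warns = polyfit_warns([1.0, 1.0], [0.0, 2.0])
sloped_warns = polyfit_warns([0.0, 1.0], [0.0, 2.0])
vertical_line = fit_line_safe([1.0, 1.0], [0.0, 2.0])
sloped_line = fit_line_safe([0.0, 1.0], [0.0, 2.0])
```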
can't save the model

Hi, many thanks for your great contribution! When I finish training and save the model using "torch.save()", I get the following error: [can't pickle _thread.RLock objects]. I use Python 3.6 and torch 1.0.0. Is something wrong with the model structure? Do you have any suggestions? Thanks for your reply!

opened by SherryShall 2
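
The error in the issue above is raised by `pickle` whenever the object being saved holds a lock somewhere in its attributes. The stdlib-only sketch below shows that behaviour, assuming (this is not confirmed for this repository) that the model object holds such an attribute, for instance through a logger. The dictionaries are hypothetical stand-ins for a model and its weights.

```python
import pickle
import threading


def can_pickle(obj):
    """Return True if pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


# Hypothetical stand-in for a model object: plain weights plus a lock-bearing helper.
full_object = {
    "weights": {"conv1.weight": [0.1, 0.2]},
    "logger_lock": threading.RLock(),
}
weights_only = full_object["weights"]

full_ok = can_pickle(full_object)      # the RLock makes this fail
weights_ok = can_pickle(weights_only)  # plain data serializes fine
```

With PyTorch, the analogous approach is to save only the parameters, `torch.save(model.state_dict(), path)`, and restore them with `model.load_state_dict(torch.load(path))`.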

Question about the ROIRotate code

I don't quite understand what the following function does. Could you briefly explain it?

    def param2theta(param, w, h):
        param = np.vstack([param, [0, 0, 1]])
        param = np.linalg.inv(param)

        theta = np.zeros([2, 3])
        theta[0, 0] = param[0, 0]
        theta[0, 1] = param[0, 1] * h / w
        theta[0, 2] = param[0, 2] * 2 / w + theta[0, 0] + theta[0, 1] - 1
        theta[1, 0] = param[1, 0] * w / h
        theta[1, 1] = param[1, 1]
        theta[1, 2] = param[1, 2] * 2 / h + theta[1, 0] + theta[1, 1] - 1
        return theta

opened by Fighting-JJ 2
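
The `param2theta` function quoted above appears to convert a 2x3 affine matrix expressed in pixel coordinates into the normalized [-1, 1] convention that `torch.nn.functional.affine_grid` expects, inverting it first so that it maps target coordinates back to source coordinates. This is a reading of the code, not a statement from the author. The sketch below checks that reading with NumPy only, assuming x_norm = 2x/w - 1, y_norm = 2y/h - 1, and that source and target share the same w and h. The sample matrix and point are made up.

```python
import numpy as np


def param2theta(param, w, h):
    # Same arithmetic as the function quoted in the issue.
    param = np.vstack([param, [0, 0, 1]])
    param = np.linalg.inv(param)  # forward (src -> dst) becomes dst -> src
    theta = np.zeros([2, 3])
    theta[0, 0] = param[0, 0]
    theta[0, 1] = param[0, 1] * h / w
    theta[0, 2] = param[0, 2] * 2 / w + theta[0, 0] + theta[0, 1] - 1
    theta[1, 0] = param[1, 0] * w / h
    theta[1, 1] = param[1, 1]
    theta[1, 2] = param[1, 2] * 2 / h + theta[1, 0] + theta[1, 1] - 1
    return theta


def to_normalized(point, w, h):
    """Map pixel coordinates to the assumed [-1, 1] range."""
    return np.array([2.0 * point[0] / w - 1.0, 2.0 * point[1] / h - 1.0])


w, h = 100, 50
angle = np.deg2rad(30.0)
# Hypothetical forward transform in pixels: rotate by 30 degrees, shift by (10, 5).
param = np.array([[np.cos(angle), -np.sin(angle), 10.0],
                  [np.sin(angle), np.cos(angle), 5.0]])
theta = param2theta(param, w, h)

# Route 1: invert the pixel transform, then normalize the resulting source point.
dst_px = np.array([40.0, 20.0])
inverse = np.linalg.inv(np.vstack([param, [0, 0, 1]]))
src_px = (inverse @ np.append(dst_px, 1.0))[:2]
via_pixels = to_normalized(src_px, w, h)

# Route 2: normalize the target point, then apply theta directly.
via_theta = theta @ np.append(to_normalized(dst_px, w, h), 1.0)

routes_agree = bool(np.allclose(via_pixels, via_theta))
identity_theta = param2theta(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]), w, h)
```

Both routes give the same normalized source point, which is why the `h / w` and `w / h` factors and the `+ theta[...] - 1` offsets appear: they absorb the change of units between pixels and the normalized grid.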

How to understand the Eqn. (4-8) in the FOTS paper

Hi @jiangxiluning thank you for sharing the code. I have recently read the paper but I cannot understand the Eqn. (4-8) in the FOTS paper. The equations are as follows:

According to them, computing the M matrix requires tx and ty, which both depend on the (x, y) position (Eqn. 4 and 5). Does that mean there is one M matrix for each point (x, y) in the input feature map? As far as I know, one bounding box needs only one transformation matrix. Am I missing something or misunderstanding the equations? Looking forward to your reply, thanks a lot!

opened by Remember2018 2
Documentation on preparing and training
@jiangxiluning thank you for your hard work.

Can you share some documentation on:

Preparing the training data.

How to train

How to use the model.

Thank you. I look forward to your reply.
opened by ghost 2
Exception[pytorch_lightning]: You are trying to `self.log()` but it is not managed by the `Trainer` control flow

Hi, I hit a crash while training the model. The problem may come from pytorch_lightning, but I don't understand why 'self.log' cannot be used. Please help me solve it.

I ran the project on Colab. The pytorch_lightning version is 1.5.

    Traceback (most recent call last):
      File "train.py", line 99, in <module>
        main(config, args.resume)
      File "train.py", line 75, in main
        trainer.fit(model=model, datamodule=data_module)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 769, in fit
        self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 719, in _call_and_handle_interrupt
        return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
        return function(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 809, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1234, in _run
        results = self._run_stage()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1321, in _run_stage
        return self._run_train()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1351, in _run_train
        self.fit_loop.run()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/fit_loop.py", line 268, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 246, in advance
        self.trainer._logger_connector.update_train_step_metrics()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 197, in update_train_step_metrics
        self._log_gpus_metrics()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 226, in _log_gpus_metrics
        key, mem, prog_bar=False, logger=True, on_step=True, on_epoch=False
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/lightning.py", line 386, in log
        "You are trying to `self.log()` but it is not managed by the `Trainer` control flow"
    pytorch_lightning.utilities.exceptions.MisconfigurationException: You are trying to `self.log()` but it is not managed by the `Trainer` control flow

opened by JANGSOONMYUN 1
fatal error: torch/extension.h: No such file or directory
When I run bash build.sh, I get

    rroi_align_kernel.cu:1:10: fatal error: torch/extension.h: No such file or directory
        1 | #include <torch/extension.h>

How to resolve this problem?
opened by limaries30 1

Training reports recall: 0, hmean: 0, precious: 0

epoch          : 1
loss           : 0.03228356248388688
precious       : 0.0
recall         : 0.0
hmean          : 0.0
val_precious   : 0.0
val_recall     : 0.0
val_hmean      : 0.0

Can someone help me?

opened by suven2019 1
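
For reference, the three metrics in the log above are tied together by a simple formula, so all of them are zero whenever no detection is matched to a ground-truth box. This is a minimal sketch assuming the usual detection definitions (`precious` in the log is read here as precision). The function name and the counts are hypothetical, and the repository's own evaluation code may differ.

```python
def detection_metrics(num_matched, num_detected, num_ground_truth):
    """Return (precision, recall, hmean) from match counts, guarding against division by zero."""
    precision = num_matched / num_detected if num_detected else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    total = precision + recall
    # hmean is the harmonic mean of precision and recall (the F1 score).
    hmean = 2 * precision * recall / total if total else 0.0
    return precision, recall, hmean


no_matches = detection_metrics(0, 10, 16)
some_matches = detection_metrics(8, 10, 16)
```

A run that reports a small loss with all metrics at exactly 0.0 therefore has zero matched boxes, which points at the matching or post-processing step rather than at the formula itself.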

How can I get the recognition result?

I'm training and checking the results. Drawing boxes on an image works well. I also want to check the text recognition result. Which part of the code produces it?

opened by freegear 1
ModuleNotFoundError: No module named 'pretrainedmodels'

    python eval.py
    Traceback (most recent call last):
      File "eval.py", line 10, in <module>
        from model.model import FOTSModel
      File "/Users/zakj/Desktop/work/detection_fots/model/model.py", line 6, in <module>
        import pretrainedmodels as pm
    ModuleNotFoundError: No module named 'pretrainedmodels'

opened by happog 1
Error while running "pip install -r reqs.txt"

    ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work'

The first lines of reqs.txt are:

    absl-py @ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work
    aiohttp==3.7.4.post0
    async-timeout==3.0.1
    attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work

Is this because of the "@ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work" entries?

opened by lingfengqiu 0
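
The `name @ file:///...` entries quoted in the issue above point at build directories that exist only on the machine where the requirements file was generated, so pip cannot resolve them elsewhere. One possible workaround, sketched below, is to rewrite those entries to the bare package name. `strip_local_paths` is a hypothetical helper, and the sample lines are the ones quoted in the issue.

```python
import re

# Matches "name @ file://..." and captures the package name.
LOCAL_PATH_ENTRY = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*@\s*file://")


def strip_local_paths(lines):
    """Replace 'name @ file://...' requirement lines with just 'name'."""
    cleaned = []
    for line in lines:
        match = LOCAL_PATH_ENTRY.match(line)
        cleaned.append(match.group(1) if match else line.strip())
    return cleaned


sample = [
    "absl-py @ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work",
    "aiohttp==3.7.4.post0",
    "async-timeout==3.0.1",
    "attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work",
]
cleaned = strip_local_paths(sample)
```

The rewritten entries lose their version pins, so pip will pick whatever version it resolves. When regenerating the file from a working environment, `pip list --format=freeze` writes `name==version` lines without local paths.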
Compilation succeeded but the import failed
ImportError: /opt/conda/lib/python3.6/site-packages/rotate_roi-0.0.0-py3.6-linux-x86_64.egg/rotated_roi.cpython-36m-x86_64-linux-gnu.so: undefined symbol: RROIAlignForwardLaucher

env:

`[07/26 03:17:38 LP]: Environment info:

    sys.platform             linux
    Python                   3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
    numpy                    1.18.1
    PyTorch                  1.6.0 @/opt/conda/lib/python3.6/site-packages/torch
    PyTorch debug build      False
    GPU available            True
    GPU 0,1,2                GeForce RTX 2080 Ti
    CUDA_HOME                /usr/local/cuda
    TORCH_CUDA_ARCH_LIST     5.2 6.0 6.1 7.0 7.5+PTX
    Pillow                   8.4.0
    torchvision              0.7.0 @/opt/conda/lib/python3.6/site-packages/torchvision
    torchvision arch flags   sm_35, sm_50, sm_60, sm_70, sm_75
    cv2                      3.4.1

PyTorch built with:

GCC 7.3

C++ Version: 201402

Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications

Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)

OpenMP 201511 (a.k.a. OpenMP 4.5)

NNPACK is enabled

CPU capability usage: AVX2

CUDA Runtime 10.2

NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75

CuDNN 7.6.5

Magma 2.5.2

Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, `
opened by tycallen 0

Owner

Ning Lu

CV and NLP. Hiring OCR interns on a long-term basis; feel free to get in touch.
