FOTS PyTorch Implementation

Overview

News!!! The recognition branch has now been added to the model. The whole project has been optimized and refactored.

  • ICDAR Dataset
  • SynthText 800K Dataset
  • detection branch (verified on the training set; it works!)
  • recognition branch
  • eval
  • multi-gpu training
  • reasonable project structure
  • wandb
  • pytorch_lightning

Introduction

This is a PyTorch implementation of FOTS.

Instruction

Requirements

  1. Build the tools:

    ./build.sh

  2. Prepare the dataset.

  3. Create a virtual environment (you may need conda):

    conda create --name fots --file spec-file.txt
    conda activate fots
    pip install -r reqs.txt

Training

# For single-GPU training, set gpus to [0], where 0 is the ID of your GPU.
python train.py -c pretrain.json
python train.py -c finetune.json
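
As a sketch of the single-GPU setting mentioned in the comment above, assuming the JSON config has a top-level `gpus` key (the key name comes from the comment; its position in the file, the other example field, and the helper name `with_single_gpu` are assumptions):

```python
import json


def with_single_gpu(config, gpu_id=0):
    """Return a copy of a config dict whose `gpus` entry lists one GPU ID."""
    updated = dict(config)      # shallow copy, so the caller's dict is untouched
    updated["gpus"] = [gpu_id]  # assumed key, taken from the training comment
    return updated


# Hypothetical config contents, for illustration only.
example = with_single_gpu({"name": "pretrain", "gpus": [0, 1, 2, 3]})
print(json.dumps(example, indent=2))
```

The same dict can be loaded from and written back to `pretrain.json` with `json.load` and `json.dump`.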

Evaluation

python eval.py -m <model.tar.gz> -i <input_images_folder> -o <output_folders>

Benchmarking and Models (Coming soon!)

Acknowledgement

Comments
  • RuntimeError: Cannot compile lanms??

    make: Entering directory “/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms”
    g++ -o adaptor.so -I include -std=c++11 -O3 -I/home/rwd/anaconda2/envs/py351/include/python3.5m -I/home/rwd/anaconda2/envs/py351/include/python3.5m -Wno-unused-result -Wsign-compare -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -fdebug-prefix-map==/usr/local/src/conda/- -fdebug-prefix-map==/usr/local/src/conda-prefix -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -flto -DNDEBUG -fwrapv -O3 -Wall -Wstrict-prototypes -L/home/rwd/anaconda2/envs/py351/lib/python3.5/config-3.5m -L/home/rwd/anaconda2/envs/py351/lib -lpython3.5m -lpthread -ldl -lutil -lrt -lm -Xlinker -export-dynamic adaptor.cpp include/clipper/clipper.cpp --shared -fPIC
    g++: error: unrecognized command line option ‘-fno-plt’
    make: *** [adaptor.so] Error 1
    make: Leaving directory “/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms”
    Traceback (most recent call last):
      File "train.py", line 14, in <module>
        from trainer import Trainer
      File "/home/rwd/graduate/code/fots.PyTorch-master/trainer/__init__.py", line 1, in <module>
        from .trainer import *
      File "/home/rwd/graduate/code/fots.PyTorch-master/trainer/trainer.py", line 4, in <module>
        from utils.bbox import Toolbox
      File "/home/rwd/graduate/code/fots.PyTorch-master/utils/bbox.py", line 8, in <module>
        from . import lanms
      File "/home/rwd/graduate/code/fots.PyTorch-master/utils/lanms/__init__.py", line 8, in <module>
        raise RuntimeError('Cannot compile lanms: {}'.format(BASE_DIR))
    RuntimeError: Cannot compile lanms: /home/rwd/graduate/code/fots.PyTorch-master/utils/lanms

    opened by monologue1107 15
  • Loss: Nan

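    I am using the SynthText dataset. During training, the following warning appears:

    RankWarning: Polyfit may be poorly conditioned
    edge = fit_line([p0[0], p1[0]], [p0[1], p1[1]])

    The warning above does not affect training, but after training for a while the loss becomes NaN. I did not run build.sh because the school cluster environment has no NVCC; other issues seem to suggest that it does not need to be compiled. I hope this problem can be solved.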

    opened by MiTeng0215 8
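
The `RankWarning` quoted in the issue above is raised by NumPy's `polyfit` when the fit is poorly conditioned; for a two-point fit this can happen when the two x-coordinates coincide or nearly coincide, that is, when the edge is close to vertical. As a minimal sketch, assuming `fit_line` is meant to return the coefficients of a line `a*x + b*y + c = 0` through two points (the issue does not show its body), the coefficients can be computed in closed form without `polyfit`. The names `line_through` and `residual` are hypothetical, and this only avoids the warning; it is not a confirmed fix for the NaN loss.

```python
def line_through(p0, p1):
    """Return (a, b, c) such that a*x + b*y + c == 0 holds for both points.

    No slope is computed, so vertical edges need no special case.
    If p0 == p1 the result is (0, 0, 0), which does not define a line.
    """
    (x0, y0), (x1, y1) = p0, p1
    a = y1 - y0
    b = x0 - x1
    c = x1 * y0 - x0 * y1
    return a, b, c


def residual(line, point):
    """Evaluate a*x + b*y + c at a point; 0 means the point lies on the line."""
    a, b, c = line
    return a * point[0] + b * point[1] + c


vertical = line_through((3.0, 1.0), (3.0, 7.0))  # equal x: a slope-based fit is ill-conditioned
sloped = line_through((0.0, 1.0), (2.0, 5.0))    # the line y = 2x + 1
print(vertical, residual(vertical, (3.0, 4.0)))
print(sloped, residual(sloped, (1.0, 3.0)))
```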
  • can't save the model

    Hi, many thanks for your great contribution! When I finished training and tried to save the model with "torch.save()", I got the following error: [can't pickle _thread.RLock objects]. I use Python 3.6 and torch 1.0.0. Is something wrong with the model structure? Do you have any suggestions? Thanks for your reply!

    opened by SherryShall 2
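
The error in the issue above means that something reachable from the object passed to `torch.save()` holds a `_thread.RLock`, which Python's `pickle` cannot serialize. A commonly used workaround in PyTorch is to save `model.state_dict()` rather than the whole model object. The standard-library sketch below illustrates the idea without PyTorch; the `Wrapper` class, its fields, and its `state_dict` method are hypothetical stand-ins, not code from this repository.

```python
import pickle
import threading


class Wrapper:
    """Hypothetical stand-in for a model object that also holds a lock."""

    def __init__(self):
        self.weights = {"conv.weight": [0.1, 0.2], "conv.bias": [0.0]}
        self._lock = threading.RLock()  # locks cannot be pickled

    def state_dict(self):
        """Return only the plain data that needs to be saved."""
        return dict(self.weights)


obj = Wrapper()

# Pickling the whole object fails (typically a TypeError mentioning RLock).
error = None
try:
    pickle.dumps(obj)
except Exception as exc:
    error = exc

# Pickling only the state works and round-trips unchanged.
restored = pickle.loads(pickle.dumps(obj.state_dict()))
print(type(error).__name__, restored)
```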
  • Question about the ROIrotate code

    I don't quite understand what the following function does. Could you briefly explain it?

        import numpy as np

        def param2theta(param, w, h):
            # Append [0, 0, 1] to make the 2x3 affine matrix 3x3, then invert it.
            param = np.vstack([param, [0, 0, 1]])
            param = np.linalg.inv(param)

            # Rescale the inverted matrix into a 2x3 theta using w and h.
            theta = np.zeros([2, 3])
            theta[0, 0] = param[0, 0]
            theta[0, 1] = param[0, 1] * h / w
            theta[0, 2] = param[0, 2] * 2 / w + theta[0, 0] + theta[0, 1] - 1
            theta[1, 0] = param[1, 0] * w / h
            theta[1, 1] = param[1, 1]
            theta[1, 2] = param[1, 2] * 2 / h + theta[1, 0] + theta[1, 1] - 1
            return theta

    opened by Fighting-JJ 2
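
As a hedged reading of `param2theta` in the issue above: it appears to convert a 2x3 affine matrix expressed in pixel coordinates into the normalized form expected by samplers that work in [-1, 1] coordinates (for example PyTorch's `affine_grid`), which map output positions back to input positions, hence the inversion. Assuming the normalization x_n = 2x/w - 1 and y_n = 2y/h - 1, the pure-Python sketch below repeats the same arithmetic and checks it on one point. The helper names and the example matrix are illustrative only.

```python
def invert_affine(m):
    """Invert a 2x3 affine matrix [[a, b, c], [d, e, f]] given in pixels."""
    (a, b, c), (d, e, f) = m
    det = a * e - b * d
    ia, ib, ic, ie = e / det, -b / det, -d / det, a / det
    return [[ia, ib, -(ia * c + ib * f)],
            [ic, ie, -(ic * c + ie * f)]]


def param2theta_py(m, w, h):
    """Same arithmetic as param2theta, written without NumPy."""
    p = invert_affine(m)
    t00, t01 = p[0][0], p[0][1] * h / w
    t10, t11 = p[1][0] * w / h, p[1][1]
    return [[t00, t01, p[0][2] * 2 / w + t00 + t01 - 1],
            [t10, t11, p[1][2] * 2 / h + t10 + t11 - 1]]


def apply(m, x, y):
    """Apply a 2x3 affine matrix to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def to_norm(x, y, w, h):
    """Map pixel coordinates to [-1, 1] (assumed convention)."""
    return 2 * x / w - 1, 2 * y / h - 1


w, h = 200, 100
m = [[0.8, -0.6, 30.0], [0.6, 0.8, -10.0]]  # rotation plus translation, in pixels
theta = param2theta_py(m, w, h)

src = (40.0, 25.0)
dst = apply(m, *src)                          # forward map in pixel space
got = apply(theta, *to_norm(*dst, w, h))      # theta: normalized dst -> normalized src
want = to_norm(*src, w, h)
max_err = max(abs(got[0] - want[0]), abs(got[1] - want[1]))
print(max_err)
```

Under that assumed convention, `theta` sends the normalized destination point back to the normalized source point, so `max_err` should be at floating-point noise level.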
  • How to understand the Eqn. (4-8) in the FOTS paper

    Hi @jiangxiluning, thank you for sharing the code. I have recently read the paper, but I cannot understand Eqns. (4)-(8) in the FOTS paper. The equations are as follows:

    [image: Eqns. (4)-(8) from the FOTS paper]

    From them, to compute the matrix M, we need to compute tx and ty, which both depend on the position (x, y) (Eqns. 4 and 5). Does that mean there is one M matrix for each point (x, y) in the input feature map? As far as I know, one bounding box needs just one transformation matrix, right? Am I missing something or misunderstanding it? Looking forward to your reply, thanks a lot!

    opened by Remember2018 2
  • Documentation on preparing and training

    @jiangxiluning thank you for your hard work.

    Can you share some documentation on:

    • Preparing the training data.
    • How to train
    • How to use the model.

    Thank you. Waiting for your reply.

    opened by ghost 2
  • Exception[pytorch_lightning]: You are trying to `self.log()` but it is not managed by the `Trainer` control flow

    Hi, I ran into a crash while training the model. The problem may come from pytorch_lightning, but I don't understand why 'self.log' cannot be used. Please help me solve it.

    I ran the project on Colab. The pytorch_lightning version is 1.5.

    Traceback (most recent call last):
      File "train.py", line 99, in <module>
        main(config, args.resume)
      File "train.py", line 75, in main
        trainer.fit(model=model, datamodule=data_module)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 769, in fit
        self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 719, in _call_and_handle_interrupt
        return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
        return function(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 809, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1234, in _run
        results = self._run_stage()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1321, in _run_stage
        return self._run_train()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1351, in _run_train
        self.fit_loop.run()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/fit_loop.py", line 268, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 246, in advance
        self.trainer._logger_connector.update_train_step_metrics()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 197, in update_train_step_metrics
        self._log_gpus_metrics()
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 226, in _log_gpus_metrics
        key, mem, prog_bar=False, logger=True, on_step=True, on_epoch=False
      File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/lightning.py", line 386, in log
        "You are trying to self.log() but it is not managed by the Trainer control flow"
    pytorch_lightning.utilities.exceptions.MisconfigurationException: You are trying to self.log() but it is not managed by the Trainer control flow

    opened by JANGSOONMYUN 1
  • fatal error: torch/extension.h: No such file or directory

    When I run bash build.sh, I get

    rroi_align_kernel.cu:1:10: fatal error: torch/extension.h: No such file or directory
        1 | #include <torch/extension.h>
    

    How to resolve this problem?

    opened by limaries30 1
  • training with recall :0;hmean:0; precious:0

    epoch          : 1
    loss           : 0.03228356248388688
    precious       : 0.0
    recall         : 0.0
    hmean          : 0.0
    val_precious   : 0.0
    val_recall     : 0.0
    val_hmean      : 0.0
    

    Can someone help me?

    opened by suven2019 1
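
For context on the metrics in the log above: hmean is the harmonic mean of precision (apparently logged here as `precious`) and recall, so all three are zero whenever no predicted box is matched to a ground-truth box. A minimal sketch, assuming the counts of matched, predicted, and ground-truth boxes are already available; the function and argument names are hypothetical, not the project's evaluation code.

```python
def detection_scores(num_matched, num_predicted, num_ground_truth):
    """Return (precision, recall, hmean), using 0.0 where a ratio is undefined."""
    precision = num_matched / num_predicted if num_predicted else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    total = precision + recall
    hmean = 2 * precision * recall / total if total else 0.0
    return precision, recall, hmean


# No matches at all: every score is zero, as in the log above.
print(detection_scores(0, 12, 10))
# Some matches: 6 of 8 predictions are correct, covering 6 of 12 ground-truth boxes.
print(detection_scores(6, 8, 12))
```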
  • How can I get the recognition result?

    I'm training and checking the results. Drawing boxes on an image file works well. I want to check the text recognition result; which code produces it?

    opened by freegear 1
  • ModuleNotFoundError: No module named 'pretrainedmodels'

    python eval.py
    Traceback (most recent call last):
      File "eval.py", line 10, in <module>
        from model.model import FOTSModel
      File "/Users/zakj/Desktop/work/detection_fots/model/model.py", line 6, in <module>
        import pretrainedmodels as pm
    ModuleNotFoundError: No module named 'pretrainedmodels'

    opened by happog 1
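
The traceback above says that the `pretrainedmodels` package is not installed in the active environment. A small sketch, using only the standard library, for checking which top-level modules can be found before running `eval.py`; the helper name is hypothetical and the list of names passed in is an example, not the project's full dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_modules(["json", "pretrainedmodels"]))
```

A reported name can usually be installed with pip, although the package name on the index sometimes differs from the import name.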
  • error while "pip install -r reqs.txt"

    ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work'

    absl-py @ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work
    aiohttp==3.7.4.post0
    async-timeout==3.0.1
    attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work

    Is this because of the "@ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work" entry?

    opened by lingfengqiu 0
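
Entries of the form `name @ file:///...` are what `pip freeze` writes for packages that were installed from a local build directory, and those paths do not exist on another machine. One possible workaround, sketched below, is to rewrite such entries to the bare package name so that pip resolves them from the package index; note that this drops the pinned version. The function name `strip_local_paths` is hypothetical.

```python
def strip_local_paths(lines):
    """Replace 'name @ file://...' requirement entries with just 'name'."""
    cleaned = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, target = line.partition(" @ ")
        if sep and target.startswith("file://"):
            cleaned.append(name.strip())  # local path is unusable elsewhere
        else:
            cleaned.append(line)          # keep ordinary pins unchanged
    return cleaned


reqs = [
    "absl-py @ file:///home/conda/feedstock_root/build_artifacts/absl-py_1615404881292/work",
    "aiohttp==3.7.4.post0",
    "async-timeout==3.0.1",
    "attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work",
]
print("\n".join(strip_local_paths(reqs)))
```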
  • Compilation succeeded but import failed

    ImportError: /opt/conda/lib/python3.6/site-packages/rotate_roi-0.0.0-py3.6-linux-x86_64.egg/rotated_roi.cpython-36m-x86_64-linux-gnu.so: undefined symbol: RROIAlignForwardLaucher

    env:

    [07/26 03:17:38 LP]: Environment info:

    sys.platform            linux
    Python                  3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
    numpy                   1.18.1
    PyTorch                 1.6.0 @/opt/conda/lib/python3.6/site-packages/torch
    PyTorch debug build     False
    GPU available           True
    GPU 0,1,2               GeForce RTX 2080 Ti
    CUDA_HOME               /usr/local/cuda
    TORCH_CUDA_ARCH_LIST    5.2 6.0 6.1 7.0 7.5+PTX
    Pillow                  8.4.0
    torchvision             0.7.0 @/opt/conda/lib/python3.6/site-packages/torchvision
    torchvision arch flags  sm_35, sm_50, sm_60, sm_70, sm_75
    cv2                     3.4.1


    PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.2
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
    • CuDNN 7.6.5
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
    opened by tycallen 0
Owner
Ning Lu
CV and NLP. Long-term hiring of interns for OCR work; feel free to get in touch.