An unofficial PyTorch implementation of PAN (PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Overview

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

  • pytorch 1.1+
  • torchvision 0.3+
  • pyclipper
  • opencv3
  • gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

The updated models (resnet18: 78.8, shufflenetv2: 72.4; lr: 1e-3) are not the best models.

google drive

Data Preparation

train: prepare a text file in the following format, one image/label pair per line, using '\t' as the separator

/path/to/img.jpg path/to/label.txt
...
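
The list format above can be generated and sanity-checked with a short script. This is a minimal sketch, not part of the repository: the function names, the directory arguments, and the `label_pattern` naming rule are all assumptions made for illustration (ICDAR 2015 ground truth, for example, is usually named `gt_{stem}.txt`).

```python
from pathlib import Path


def write_train_list(img_dir, label_dir, out_path, label_pattern="{stem}.txt"):
    """Write one 'image<TAB>label' line per .jpg that has a matching label file.

    label_pattern is a hypothetical naming rule, not something the repo defines.
    Returns the number of lines written.
    """
    img_dir, label_dir = Path(img_dir), Path(label_dir)
    lines = []
    for img in sorted(img_dir.glob("*.jpg")):
        label = label_dir / label_pattern.format(stem=img.stem)
        if label.is_file():  # skip images without an annotation file
            lines.append(f"{img.resolve()}\t{label.resolve()}")
    text = "\n".join(lines) + ("\n" if lines else "")
    Path(out_path).write_text(text, encoding="utf-8")
    return len(lines)


def check_train_list(list_path):
    """Return how many lines hold exactly two tab-separated paths that both exist."""
    valid = 0
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        parts = line.strip().split("\t")
        if len(parts) == 2 and all(Path(p).is_file() for p in parts):
            valid += 1
    return valid
```

If `check_train_list` returns 0, the data loader would likely see an empty dataset, so it may be worth running this check before starting training.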

val: use a folder with the following layout

img/ stores the images
gt/ stores the ground-truth files

Train

  1. Configure train_data_path and val_data_path in config.json.
  2. Run the following script:
python3 train.py

Test

eval.py evaluates a model on the test dataset.

  1. Configure model_path, img_path, gt_path, and save_path in eval.py.
  2. Run the following script to test:
python3 eval.py

Predict

predict.py runs inference on a single image.

  1. Configure model_path and img_path in predict.py.
  2. Run the following script to predict:
python3 predict.py

The project is still under development.

Performance

ICDAR 2015

Trained only on the ICDAR 2015 dataset.

Method | Image size (short side) | Learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS
paper (resnet18) | 736 | x | x | x | 80.4 | 26.1
my (ShuffleNetV2+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 81.72 | 66.73 | 73.47 | 24.71 (P100)
my (resnet18+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 84.93 | 74.09 | 79.14 | 21.31 (P100)
my (resnet50+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 84.23 | 76.12 | 79.96 | 14.22 (P100)
my (ShuffleNetV2+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 75.14 | 57.34 | 65.04 | 24.71 (P100)
my (resnet18+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 83.89 | 69.23 | 75.86 | 21.31 (P100)
my (resnet50+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 85.29 | 75.1 | 79.87 | 14.22 (P100)
my (resnet18+FPN+PSE expansion) | 736 | 1e-3 | 76.50 | 74.70 | 75.59 | 14.47 (P100)
my (resnet50+FPN+PSE expansion) | 736 | 1e-3 | 71.82 | 75.73 | 73.72 | 10.67 (P100)
my (resnet18+FPN+PSE expansion) | 736 | 1e-4 | 74.19 | 72.34 | 73.25 | 14.47 (P100)
my (resnet50+FPN+PSE expansion) | 736 | 1e-4 | 78.96 | 76.27 | 77.59 | 10.67 (P100)
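
The F-measure values in the results above are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). A minimal sketch for checking a row (the function name `f_measure` is an assumption for illustration, not something taken from the repository):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# resnet18 + FPEM_FFM at lr 1e-3: precision 84.93, recall 74.09
print(round(f_measure(84.93, 74.09), 2))  # 79.14
```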

examples

todo

  • MobileNet backbone

  • ShuffleNet backbone

reference

  1. https://arxiv.org/pdf/1908.05900.pdf
  2. https://github.com/WenmuZhou/PSENet.pytorch

If this repository helps you, please star it. Thanks.

Comments
  • The loss does not decrease

    Great work. Before your update, I tried to train PAN, but the loss was still high at the end of training. Does the current version support effective training and inference?

    opened by oysz2016 7
  • some problem about pse.cpp

    Thanks for sharing. I have a question: when testing with predict.py (using the resnet18 model, on macOS), I get the following error. Could you give some advice? Best wishes!

    pse.cpp:49:29: error: variable-sized object may not be initialized
            float kernel_vector[label_num][5] = {0};
                                ^~~~~~~~~
    1 error generated.
    make: *** [pse.so] Error 1
    Traceback (most recent call last):
      File "/Users/abelleon/Documents/project/PAN.pytorch-master/predict.py", line 13, in <module>
        from post_processing import decode
      File "/Users/abelleon/Documents/project/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
        raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
    RuntimeError: Cannot compile pse: /Users/abelleon/Documents/project/PAN.pytorch-master/post_processing
    
    opened by ssxxx1a 6
  • Exception: ZIP entry not valid

    /utils/cal_recall/rrc_evaluation_funcs.py", line 102, in load_folder_file
        raise Exception('ZIP entry not valid: %s' % name)
    Exception: ZIP entry not valid: res_100104531.txt

    This error is raised during both validation and eval. What does it mean, and how should I fix it?

    opened by mrwu-mac 2
  • Segmentation fault (core dumped)

    When I run predict.py:

    [shakey@xiaoi-778 PAN]$ python predict.py
    make: Entering directory `/opt/shakey/deep-learning/PAN/post_processing'
    make: `pse.so' is up to date.
    make: Leaving directory `/opt/shakey/deep-learning/PAN/post_processing'
    self。gpu 1 ininstance True torch.cuda.is True
    self。gpu 1 ininstance True torch.cuda.is True
    device: cuda:0
    Segmentation fault (core dumped)

    opened by cuimiao187561 1
  • Batch image detection & Exporting the detected boxes

    @WenmuZhou Thank you for your hard work,

    It would be great if you would add:

    • Script to detect/ inference on multiple images.
    • The option to export the detected boxes, perhaps .txt containing x1,y1,x2,y2,x3,y3,x4,y4
    opened by ghost 1
  • num_samples=0

    @WenmuZhou Thank you for your hard work,

    I am trying to train on ICDAR 2015, and I get an error when running train.py. My config.json, my training and testing lists, and file tree.

    The error message:

    (final) home@home-desktop:~/p2/PAN.pytorch-master$ python train.py
    Traceback (most recent call last):
      File "train.py", line 33, in <module>
        main(config)
      File "train.py", line 18, in main
        train_loader, eval_loader = get_dataloader(config['data_loader']['type'], config['data_loader']['args'])
      File "/home/home/p2/PAN.pytorch-master/data_loader/__init__.py", line 98, in get_dataloader
        num_workers=module_args['loader']['num_workers'])
      File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
        sampler = RandomSampler(dataset)
      File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 94, in __init__
        "value, but got num_samples={}".format(self.num_samples))
    ValueError: num_samples should be a positive integer value, but got num_samples=0
    
    opened by ghost 1
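  • [Question] FPEM module

    Hello. In your FPEM implementation there are self.add_up and self.add_down. As I understand it, these up-sample or down-sample feature maps of different sizes, add them, and then fuse the result. I think the fusion module should be a different instance for each fusion step, rather than the same module instance reused every time. For example, in the Up-scale Enhancement, the add_up applied after adding c5 and the down-sampled c4 and the add_up applied after adding c4 and the down-sampled c3 should be two different module instances. The two add_up modules have the same structure, but their parameters should not be shared, and the paper does not seem to say that these parameters are shared. So my understanding is that each add_up step should create its own nn.Sequential, and likewise for down-sample. I am not sure whether this is right or whether I have misunderstood something. Sorry, I find it hard to express clearly what I mean.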

    opened by xiaohuihuichao 0
  • Finetune checkpoint

    There is a parameter "finetune_checkpoint". I would like to know how it works here. Does it freeze any initial layers, and if so, how can that be controlled?

    opened by adijindal30 0
  • subprocess.call in post_processing

      File "/Users/liubowen/Downloads/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
        raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
    RuntimeError: Cannot compile pse: /Users/liubowen/Downloads/PAN.pytorch-master/post_processing

    Is there a solution? Please!!

    opened by bowenliu1996 0
Owner
zhoujun
Deep learning engineer, currently preparing to work on on-device deployment