An unofficial PyTorch implementation of PAN (PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

zhoujun

Last update: Dec 26, 2022

Overview

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

pytorch 1.1+
torchvision 0.3+
pyclipper
opencv3
gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

The updated model (resnet18: 78.8, shufflenetv2: 72.4, lr: 1e-3) is not the best model.

google drive

Data Preparation

train: prepare a text file in the following format, one sample per line, using '\t' as the separator

/path/to/img.jpg path/to/label.txt
...
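
As a rough sketch of how such a list file could be generated: the helper below pairs each image in one folder with a label file in another and writes one tab-separated line per pair. The function name build_train_list and the default label naming (gt_<image stem>.txt, as used by the ICDAR2015 ground-truth files) are assumptions for illustration, not part of this repository.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def build_train_list(img_dir, gt_dir, list_path,
                     label_name=lambda stem: "gt_" + stem + ".txt"):
    """Write 'image_path<TAB>label_path' lines and return how many were written.

    label_name maps an image stem to its label file name. The default is an
    assumption (ICDAR2015-style naming); pass another callable if yours differs.
    """
    count = 0
    with open(list_path, "w", encoding="utf-8") as out:
        for name in sorted(os.listdir(img_dir)):
            stem, ext = os.path.splitext(name)
            if ext.lower() not in IMAGE_EXTENSIONS:
                continue  # not an image
            label_path = os.path.join(gt_dir, label_name(stem))
            if not os.path.isfile(label_path):
                continue  # skip images that have no label file
            out.write(os.path.join(img_dir, name) + "\t" + label_path + "\n")
            count += 1
    return count
```

Calling build_train_list("img", "gt", "train.txt") would then produce a file in the format shown above; a return value of 0 means no image/label pair was found.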

val: use a folder with the following layout

img/ stores the images
gt/ stores the ground-truth files

Train

Set train_data_path and val_data_path in config.json.
Use the following script to run training:

python3 train.py
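
If editing config.json by hand is inconvenient, the two paths could also be set from a small script before launching training. The sketch below is only an illustration: the helper names are hypothetical, and because the exact nesting of train_data_path and val_data_path inside config.json is not described here, it searches the whole JSON tree for the key instead of assuming a location.

```python
import json


def set_key_everywhere(node, key, value):
    """Replace every occurrence of `key` in nested dicts/lists; return the hit count."""
    hits = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                hits += 1
            else:
                hits += set_key_everywhere(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            hits += set_key_everywhere(item, key, value)
    return hits


def update_config(config_path, updates):
    """Apply {key: value} updates to a JSON config file and rewrite it in place."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    missing = []
    for key, value in updates.items():
        if set_key_everywhere(config, key, value) == 0:
            missing.append(key)
    if missing:
        # Nothing is written back if any requested key is absent.
        raise KeyError("keys not found in config: " + ", ".join(missing))
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return config
```

The value type each key expects (a single path, a list of paths, and so on) should be taken from the config.json shipped with the repository.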

Test

eval.py is used to evaluate a model on the test dataset.

Set model_path, img_path, gt_path and save_path in eval.py.
Use the following script to test:

python3 eval.py

Predict

predict.py is used to run inference on a single image.

Set model_path and img_path in predict.py.
Use the following script to predict:

python3 predict.py

The project is still under development.

Performance

ICDAR 2015

Trained only on the ICDAR2015 dataset.

Method	image size (short size)	learning rate	Precision (%)	Recall (%)	F-measure (%)	FPS
paper(resnet18)	736	x	x	x	80.4	26.1
my (ShuffleNetV2+FPEM_FFM+PSE expansion)	736	1e-3	81.72	66.73	73.47	24.71 (P100)
my (resnet18+FPEM_FFM+PSE expansion)	736	1e-3	84.93	74.09	79.14	21.31 (P100)
my (resnet50+FPEM_FFM+PSE expansion)	736	1e-3	84.23	76.12	79.96	14.22 (P100)
my (ShuffleNetV2+FPEM_FFM+PSE expansion)	736	1e-4	75.14	57.34	65.04	24.71 (P100)
my (resnet18+FPEM_FFM+PSE expansion)	736	1e-4	83.89	69.23	75.86	21.31 (P100)
my (resnet50+FPEM_FFM+PSE expansion)	736	1e-4	85.29	75.1	79.87	14.22 (P100)
my (resnet18+FPN+PSE expansion)	736	1e-3	76.50	74.70	75.59	14.47 (P100)
my (resnet50+FPN+PSE expansion)	736	1e-3	71.82	75.73	73.72	10.67 (P100)
my (resnet18+FPN+PSE expansion)	736	1e-4	74.19	72.34	73.25	14.47 (P100)
my (resnet50+FPN+PSE expansion)	736	1e-4	78.96	76.27	77.59	10.67 (P100)
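
For readers checking the table, the F-measure column is consistent with the usual harmonic mean of precision and recall. The short sketch below (the function name f_measure is just for illustration) recomputes it from the other two columns.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ShuffleNetV2 and resnet18 rows with FPEM_FFM at lr 1e-3
print(round(f_measure(81.72, 66.73), 2))  # 73.47
print(round(f_measure(84.93, 74.09), 2))  # 79.14
```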

examples

todo

MobileNet backbone
ShuffleNet backbone

reference

If this repository helps you, please star it. Thanks.

Comments

The loss does not decrease

Great job. Before your update, I tried to train PAN, but the loss was still high at the end of training. Does the current version support effective training and inference?

opened by oysz2016 7

some problem about pse.cpp

Thanks for sharing. I have a question: when testing with predict (the model uses resnet18, tested on macOS), I get the following error. Could you give me some advice? 0.0, best wishes!

pse.cpp:49:29: error: variable-sized object may not be initialized
        float kernel_vector[label_num][5] = {0};
                            ^~~~~~~~~
1 error generated.
make: *** [pse.so] Error 1
Traceback (most recent call last):
  File "/Users/abelleon/Documents/project/PAN.pytorch-master/predict.py", line 13, in <module>
    from post_processing import decode
  File "/Users/abelleon/Documents/project/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
    raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
RuntimeError: Cannot compile pse: /Users/abelleon/Documents/project/PAN.pytorch-master/post_processing

opened by ssxxx1a 6

Exception: ZIP entry not valid

/utils/cal_recall/rrc_evaluation_funcs.py", line 102, in load_folder_file
    raise Exception('ZIP entry not valid: %s' % name)
Exception: ZIP entry not valid: res_100104531.txt

This error is raised during both validation and eval. What does it mean, and how should I fix it?

opened by mrwu-mac 2
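
One possible cause, offered here as an assumption rather than a confirmed diagnosis: ICDAR-style evaluation scripts usually accept only result files whose names match a fixed pattern such as res_img_<number>.txt, and reject everything else with this message. The sketch below lists the file names that would fail such a check. The pattern is an assumed default; the one actually used should be read from the evaluation code in utils/cal_recall.

```python
import re


def invalid_result_names(names, pattern=r"res_img_([0-9]+)\.txt"):
    """Return the names that do not fully match the expected result-file pattern.

    The default pattern is an assumption based on the common ICDAR naming scheme.
    """
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name) is None]


print(invalid_result_names(["res_img_1.txt", "res_100104531.txt"]))
# ['res_100104531.txt']
```

If the image files are named differently from img_<number>, either renaming them or adjusting the pattern in the evaluation code would be the two obvious directions to try.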

Segmentation fault (core dumped)

When I run predict.py:

[shakey@xiaoi-778 PAN]$ python predict.py
make: Entering directory `/opt/shakey/deep-learning/PAN/post_processing'
make: `pse.so' is up to date.
make: Leaving directory `/opt/shakey/deep-learning/PAN/post_processing'
self。gpu 1 ininstance True torch.cuda.is True
self。gpu 1 ininstance True torch.cuda.is True
device: cuda:0
Segmentation fault (core dumped)

opened by cuimiao187561 1

Batch image detection & Exporting the detected boxes

@WenmuZhou Thank you for your hard work,

It would be great if you would add:

A script to run detection/inference on multiple images.

An option to export the detected boxes, perhaps as a .txt file containing x1,y1,x2,y2,x3,y3,x4,y4

opened by ghost 1
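
A minimal sketch of what such a batch script could look like is given below. It is not part of the repository: detect stands for any hypothetical callable that takes an image path and returns a list of 4-point boxes, and the helper names are made up for illustration. Each image gets one .txt file with one x1,y1,x2,y2,x3,y3,x4,y4 line per box.

```python
import os


def format_box(box):
    """Flatten a 4-point box [(x1, y1), ..., (x4, y4)] into 'x1,y1,...,x4,y4'."""
    if len(box) != 4:
        raise ValueError("expected 4 points, got {}".format(len(box)))
    return ",".join(str(int(round(c))) for point in box for c in point)


def export_boxes(boxes, txt_path):
    """Write one line per box to txt_path and return the number of boxes."""
    with open(txt_path, "w", encoding="utf-8") as f:
        for box in boxes:
            f.write(format_box(box) + "\n")
    return len(boxes)


def detect_folder(image_paths, detect, out_dir):
    """Run `detect` (hypothetical callable) on each image and export its boxes.

    Returns a dict mapping each image path to the .txt file written for it.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for image_path in image_paths:
        boxes = detect(image_path)
        stem = os.path.splitext(os.path.basename(image_path))[0]
        txt_path = os.path.join(out_dir, stem + ".txt")
        export_boxes(boxes, txt_path)
        written[image_path] = txt_path
    return written
```

In practice detect would wrap whatever predict.py does for a single image; how to obtain the boxes from the model is left to the repository code.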

num_samples=0

@WenmuZhou Thank you for your hard work,

I am trying to train on icdar2015, but I get an error when running train.py. My config.json, my training & testing lists, and file tree.

The error message:

(final) home@home-desktop:~/p2/PAN.pytorch-master$ python train.py
Traceback (most recent call last):
  File "train.py", line 33, in <module>
    main(config)
  File "train.py", line 18, in main
    train_loader, eval_loader = get_dataloader(config['data_loader']['type'], config['data_loader']['args'])
  File "/home/home/p2/PAN.pytorch-master/data_loader/__init__.py", line 98, in get_dataloader
    num_workers=module_args['loader']['num_workers'])
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
    sampler = RandomSampler(dataset)
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0

opened by ghost 1
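
The traceback shows PyTorch's RandomSampler rejecting a dataset of length 0, so the data loader found no usable training samples. A plausible cause, stated here as an assumption, is that the lines of the training list do not resolve to existing files or are not tab-separated. The hypothetical helper below checks a list file line by line and reports what is wrong.

```python
import os


def check_train_list(list_path):
    """Validate a 'image<TAB>label' list file.

    Returns (number_of_valid_lines, problems), where problems is a list of
    (line_number, message) tuples.
    """
    valid = 0
    problems = []
    with open(list_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            parts = line.split("\t")
            if len(parts) != 2:
                problems.append((line_no, "expected 2 tab-separated fields"))
                continue
            img_path, label_path = parts[0].strip(), parts[1].strip()
            if not os.path.isfile(img_path):
                problems.append((line_no, "image not found: " + img_path))
                continue
            if not os.path.isfile(label_path):
                problems.append((line_no, "label not found: " + label_path))
                continue
            valid += 1
    return valid, problems
```

If the first returned value is 0, the list file itself is the likely culprit; if it is positive, the path configured in config.json is the next thing to check.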

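[Question] FPEM module

Hello. In your implementation of the FPEM module there are self.add_up and self.add_down. As I understand it, these up-sample or down-sample feature maps of different sizes, add them together, and then fuse the resulting feature map. The fusion module instance should be different for each fusion, rather than the same module instance being reused every time. For example, in Up-scale Enhancement, the add_up applied after c5 and c4 are down-sampled and added should be a different module instance from the add_up applied after c4 and c3 are down-sampled and added. The two add_up blocks have the same structure, but their parameters cannot be shared, and the paper does not seem to state that these parameters are shared. So my understanding is that each add_up step should create its own nn.Sequential, and the same goes for down-sample. I am not sure whether I am right, or whether my understanding is off. Sorry, I am not good at putting this into words, and it is hard to express clearly what I mean.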

opened by xiaohuihuichao 0

Finetune checkpoint

There is a parameter "finetune_checkpoint". I would like to know how it works here. Does it freeze any initial layers, and if so, how can that be controlled?

opened by adijindal30 0

subprocess.call in post_processing

  File "/Users/liubowen/Downloads/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
    raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
RuntimeError: Cannot compile pse: /Users/liubowen/Downloads/PAN.pytorch-master/post_processing

Is there a solution? Please!!

opened by bowenliu1996 0

Owner

zhoujun

Deep learning engineer, currently preparing to work on on-device (edge) deployment
