Styled Handwritten Text Generation with Transformers (ICCV 21)

Ankan Kumar Bhunia

Last update: Dec 22, 2022

Related tags

Deep Learning transformer gan attention handwriting handwriting-synthesis document-analysis handwriting-generation handwriting-mimicky

Overview

⚡ Handwriting Transformers [PDF]

Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah

Abstract: We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.
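The encoder-decoder attention mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product attention, in which each query character vector attends over a set of style features. This is not the HWT implementation: the helper name `cross_attention`, the toy vectors, and the single-head form without learned projections are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (hypothetical helper).

    queries: one vector per query character (content side).
    keys, values: one vector per style feature (style side).
    Returns (outputs, weights): one output vector and one weight row per query.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every style key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        row = softmax(scores)
        # Weighted sum of the style values: the style gathered for this query.
        out = [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        outputs.append(out)
        weights.append(row)
    return outputs, weights


# Toy example: two query characters attending over three style features.
queries = [[1.0, 0.0], [0.0, 1.0]]
style_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
style_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
outputs, weights = cross_attention(queries, style_keys, style_values)
```

Each row of `weights` sums to 1, so every query character receives a convex combination of the style values. In a full transformer the queries, keys, and values would first pass through learned linear projections and several heads would run in parallel.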

Software environment

Python 3.7
PyTorch >=1.4

Setup & Training

Please see INSTALL.md for installing required libraries. You can change the content in the file mytext.txt to visualize generated handwriting while training.

Citation

If you use the code for your research, please cite our paper:

@InProceedings{Bhunia_2021_ICCV,
    author    = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shah, Mubarak},
    title     = {Handwriting Transformers},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1086-1094}
}

Comments

No such file or directory: '../CVL_32.pickle'

Thank you for publishing IAM_32.pickle in Issue #4. I cloned this project to my personal Colab notebook and ran "train.py". It now requires the file CVL_32.pickle.

Traceback (most recent call last):
  File "./Handwriting-Transformers/train.py", line 42, in <module>
    TextDatasetObjval = TextDatasetval()
  File "/content/Handwriting-Transformers/data/dataset.py", line 109, in __init__
    file_to_store = open(base_path, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '../CVL_32.pickle'

Would you please publish this file also?

Thank you very much. 🙇

opened by kientuongnguyen 5
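
The traceback in this issue shows `train.py` failing when `data/dataset.py` opens `'../CVL_32.pickle'`, a path resolved relative to the current working directory. A small pre-flight check can surface a missing file before training starts. This is only a sketch and not part of the repository: the helper name `missing_files` is hypothetical, and the file list contains only the paths quoted in these issues.

```python
from pathlib import Path


def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that do not exist relative to `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]


# Paths quoted in the issues; the repository may expect other files as well.
REQUIRED = ["../IAM_32.pickle", "../CVL_32.pickle"]

if __name__ == "__main__":
    for path in missing_files(REQUIRED):
        print(f"Missing dataset file: {path} (resolved against {Path.cwd()})")
```

Running the check from the same directory used to launch `train.py` matters, because the relative `..` prefix points at a different location from any other working directory.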

base_path = '../IAM_32.pickle'

Hello, what is the file referenced by base_path = '../IAM_32.pickle' in your code? Can this file be published? If so, could you please send it to me in a private message? Thanks for reading!

opened by zj916716524 4

CVL_32.pickle

Hello, what is the file referenced by base_path = './CVL_32.pickle' in your code? Can this file be published? If so, could you please send it to me in a private message? Thanks for reading.

opened by xuruiying 2

About pre-trained model.

Sir, the method proposed in your paper is an effective improvement. I am very curious how to obtain the "fid: four scenes" results in your paper. In addition, would you be willing to share the pre-trained model? I apologize for being so forward. Good luck.

opened by Fyzjym 2

Missing license

Love this work! Any chance you could add an MIT license or something? I've started my own fork, but wanted to make sure it remains open so people can keep building off of it. Thanks!

opened by Tahlor 1

When is this code complete?

Hello, thanks for the impressive work. It seems some files (the lexicon file and the txt file) have not been uploaded, and INSTALL.md is not finished. When will this code be updated? @ankanbhunia

opened by WeihongM 1

Color images

Hello!

I need to generate color images for my project. I want to try changing your architecture accordingly. I guess this should be easy to do. It is enough to change the number of channels at input and output, right? I'm wondering what you think of this possibility. Any thoughts welcome!
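
As a rough illustration of what changing the number of channels involves, the sketch below counts the learnable parameters of a standard 2D convolution for a 1-channel and a 3-channel input. The layer size (64 filters of 3x3) is hypothetical and not taken from the repository; the function name `conv2d_param_count` is likewise an assumption for this example.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters in a standard 2D convolution layer."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical first layer: 64 filters of size 3x3 (not taken from the repository).
gray = conv2d_param_count(in_channels=1, out_channels=64, kernel_size=3)
color = conv2d_param_count(in_channels=3, out_channels=64, kernel_size=3)
print(gray, color)  # prints "640 1792"
```

Because the weight tensor of the first layer changes shape, weights trained on grayscale images would not load directly into it. Any other component that consumes the images, such as a discriminator in a GAN setup, would presumably need the same input-channel change.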

opened by theotheo 1

Generate custom handwriting?

Could this be used to reproduce one particular handwriting style? (That is, one not in the IAM database -- such as my own handwriting, or the handwriting of a famous historical figure.)

If so, could you please walk me through how to do that? I would be very appreciative.

opened by SB2020-eye 2

Missing the positional encodings in the encoder

Hi, sir, thanks for the impressive work. The paper states: "To retain information regarding the order of input sequences being supplied, we add the positional encodings [23] to the input of each attention layer". However, the released code adds positional encodings only to the Multi-Head Attention of the decoder, not to that of the encoder. Is it better not to apply positional encodings in the encoder?

opened by dailenson 0
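
For readers unfamiliar with the term, the sketch below shows one common form of positional encoding: the fixed sinusoidal table from the original Transformer, added element-wise to a sequence before attention. It assumes that [23] in the quoted sentence refers to this scheme; the released code may use a different variant, and the function names here are hypothetical.

```python
import math


def sinusoidal_positional_encoding(num_positions, dim):
    """Return a num_positions x dim table of fixed sinusoidal encodings.

    Even indices use sine and odd indices use cosine, with wavelengths
    forming a geometric progression controlled by the constant 10000.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table


def add_positions(sequence, table):
    """Add the encoding for position t to the t-th vector of the sequence."""
    return [[x + p for x, p in zip(vec, table[t])] for t, vec in enumerate(sequence)]


# Toy usage: a sequence of four 6-dimensional zero vectors.
table = sinusoidal_positional_encoding(num_positions=4, dim=6)
encoded = add_positions([[0.0] * 6 for _ in range(4)], table)
```

Without such a term, self-attention treats its input as an unordered set, which is why the question of whether the encoder receives positional encodings affects what order information it can use.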

Owner

Ankan Kumar Bhunia

Electrical Engineering, Jadavpur University
