Styled Handwritten Text Generation with Transformers (ICCV 21)

Overview

Handwriting Transformers [PDF]

Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah

Abstract: We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns. The proposed HWT captures the long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state of the art, as demonstrated through extensive qualitative, quantitative, and human-based evaluations. The proposed HWT can handle text of arbitrary length and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.

Software environment

  • Python 3.7
  • PyTorch >=1.4
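
The versions listed above can be checked before training. The following is a minimal sketch and not part of the repository: the helper names `parse_version` and `meets_minimum` are hypothetical, the parser only handles plain dotted versions with an optional local suffix such as `+cu101`, and treating Python 3.7 as a lower bound (rather than an exact requirement) is an assumption.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.4.0+cu101' into a tuple of ints, e.g. (1, 4, 0)."""
    # Drop any local build suffix ('+cu101'), then keep the leading digits of each dotted part.
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    print("Python", python_version, "ok:", meets_minimum(python_version, "3.7"))
    try:
        import torch  # only needed for the PyTorch check
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:", meets_minimum(torch.__version__, "1.4"))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "1.10" sorts before "1.4" lexicographically.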

Setup & Training

See INSTALL.md for the required libraries. To change the text rendered in the handwriting samples generated during training, edit the file mytext.txt.
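
Several comments further down this page report that training stops with a FileNotFoundError for '../CVL_32.pickle', and the code also references '../IAM_32.pickle'. A pre-flight check along the following lines may make that failure easier to diagnose. This is a hedged sketch, not code from the repository: the function names `load_dataset_pickle` and `missing_files` are hypothetical, and the two paths are simply the ones quoted in the comments.

```python
import pickle
from pathlib import Path


def load_dataset_pickle(path):
    """Load a pickled dataset file, failing early with a readable message if it is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Dataset file not found: {path.resolve()}. "
            "Place the pickle there or adjust the path before training."
        )
    with path.open("rb") as handle:
        # Only unpickle files from a source you trust: pickle can execute arbitrary code.
        return pickle.load(handle)


def missing_files(paths):
    """Return the subset of paths that do not exist, so all gaps are reported at once."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Paths quoted in the comments on this page; they are relative to the working directory.
    print(missing_files(["../IAM_32.pickle", "../CVL_32.pickle"]))
```

Because the paths are relative, the result depends on the directory the script is launched from, which is worth checking first in a Colab notebook.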

Citation

If you use the code for your research, please cite our paper:

@InProceedings{Bhunia_2021_ICCV,
    author    = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shah, Mubarak},
    title     = {Handwriting Transformers},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1086-1094}
}
Comments
  • No such file or directory: '../CVL_32.pickle'

    Thank you for publishing IAM_32.pickle in Issue #4. I cloned this project into my personal Colab notebook and ran "train.py", which now requires the file CVL_32.pickle.

    Traceback (most recent call last):
      File "./Handwriting-Transformers/train.py", line 42, in <module>
        TextDatasetObjval = TextDatasetval()
      File "/content/Handwriting-Transformers/data/dataset.py", line 109, in __init__
        file_to_store = open(base_path, "rb")
    FileNotFoundError: [Errno 2] No such file or directory: '../CVL_32.pickle'
    

    Would you please publish this file also?

    Thank you very much. 🙇

    opened by kientuongnguyen 5
  • ‘base_path = '../IAM_32.pickle'’

    Hello, your code contains ‘base_path = '../IAM_32.pickle'’. What is in this file, and can it be published? If so, could you please send it to me in a private message? Thanks for reading!

    opened by zj916716524 4
  • CVL_32.pickle

    Hello, your code contains ‘base_path = './CVL_32.pickle'’. What is in this file, and can it be published? If so, could you please send it to me in a private message? Thanks for reading.

    opened by xuruiying 2
  • About pre-trained model.

    Sir, the method proposed in your paper is an effective improvement. I am very curious how to obtain the "fid: four scenes" results in your paper. In addition, would you be willing to share the pre-trained model? I apologize for being so forward. Good luck.

    opened by Fyzjym 2
  • Missing license

    Love this work! Any chance you could add an MIT license or something? I've started my own fork, but wanted to make sure it remains open so people can keep building off of it. Thanks!

    opened by Tahlor 1
  • When is this code complete?

    Hello, thanks for the impressive work. It seems that some files (the lexicon file and the txt file) have not been uploaded and that install.md is not finished. When will the code be updated? @ankanbhunia

    opened by WeihongM 1
  • Color images

    Hello!

    I need to generate color images for my project. I want to try changing your architecture accordingly. I guess this should be easy to do. It is enough to change the number of channels at input and output, right? I'm wondering what you think of this possibility. Any thoughts welcome!

    opened by theotheo 1
  • Generate custom handwriting?

    Could this be used to reproduce one particular handwriting style? (That is, one not in the IAM database -- such as my own handwriting, or the handwriting of a famous historical figure.)

    If so, could you please walk me through how to do that? I would be very appreciative.

    opened by SB2020-eye 2
  • Missing the positional encodings in the encoder

    Hi, sir, thanks for the impressive work. The paper states: "To retain information regarding the order of input sequences being supplied, we add the positional encodings [23] to the input of each attention layer". However, the released code adds positional encodings only to the Multi-Head Attention of the decoder, not to the Multi-Head Attention of the encoder. Is it better not to apply positional encodings in the encoder?

    opened by dailenson 0
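
For readers following the last comment, it may help to see what adding a positional encoding to an attention layer's input looks like. The sketch below assumes the widely used fixed sinusoidal form; the paper's reference [23] is not reproduced on this page, and the repository may implement positional encodings differently. The function names `sinusoidal_encoding` and `add_positions` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def sinusoidal_encoding(length, dim):
    """Return a length x dim table with sin on even indices and cos on odd indices."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    table = []
    for pos in range(length):
        row = []
        for i in range(0, dim, 2):
            # Each sin/cos pair uses a wavelength that grows geometrically with i.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


def add_positions(features):
    """Add the positional table element-wise to a list of equal-length feature vectors."""
    if not features:
        return []
    table = sinusoidal_encoding(len(features), len(features[0]))
    return [[x + p for x, p in zip(vec, pos)] for vec, pos in zip(features, table)]
```

Without an addition of this kind, self-attention treats its input as an unordered set, so the layer cannot distinguish one ordering of the sequence from another. Whether that matters for the encoder here is the question the comment leaves open.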
Owner
Ankan Kumar Bhunia
Electrical Engineering, Jadavpur University