A tool that allows you to detect and translate text.

Damian Panek

Last update: Nov 28, 2022

Related tags

Text Data & NLP nlp recognition deep-learning text craft pytorch text-recognition text-processing ocr-recognition crnn scene-text-detection scene-text-detectors

Overview

Text detection and recognition

This repository contains a tool that detects regions containing text and translates them one by one.

Description

Two pretrained neural networks are used. The first one detects the places where text appears and returns their coordinates. Its structure is based on the CRAFT architecture.

Craft Paper
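To make the "returns their coordinates" step more concrete, here is a minimal sketch of how a per-pixel text score map can be turned into bounding boxes by thresholding and grouping connected pixels. It is an illustration only, not the code used in this repository: the function name `boxes_from_score_map` and the `threshold` value are assumptions, and the real CRAFT post-processing also uses an affinity map, which this sketch ignores.

```python
from collections import deque


def boxes_from_score_map(score_map, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) for each connected region >= threshold.

    `score_map` is a list of rows of floats (hypothetical stand-in for a
    network output); regions are grouped with 4-connectivity.
    """
    height = len(score_map)
    width = len(score_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or score_map[y][x] < threshold:
                continue
            # Breadth-first search over neighbouring pixels above the threshold.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and score_map[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

Each returned box can then be cropped from the input image and passed to the recognition network.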

The second network takes the detected word regions and recognizes the words inside them. A Convolutional Recurrent Neural Network (CRNN) is used for this operation.

CRNN Paper
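The CRNN paper pairs the network with a CTC transcription layer, so the per-frame outputs have to be collapsed into a string. The sketch below shows greedy CTC decoding in plain Python. It assumes that class index 0 is the blank symbol and that index `i` maps to `alphabet[i - 1]`; the function name and this index layout are illustrative and may differ from what this repository does.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: take the best class per frame, merge repeats, drop blanks.

    `frame_scores` is a list of per-frame score lists (one score per class).
    """
    # Best class index for every time step.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    chars = []
    previous = blank
    for index in best:
        # Emit a character only when it is not blank and differs from the
        # previous frame; a blank between two equal labels keeps both.
        if index != blank and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)
```

A blank frame between two identical labels is what allows doubled letters (as in "hello") to survive the merge step.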

Example

Under construction

Deployment

I decided to deploy it on Heroku (a temporary solution), but the amount of memory available on this platform is not enough. You can check it on the Heroku app. I also added a Bootstrap template because it makes the whole solution more intuitive.

Windows Installation

To install it locally, run the following from your virtual environment:

python -m pip install -r requirements.txt

Linux installation

To install it properly on Linux, you have to install some additional packages:

apt-get update
apt-get install -y libsm6 libxext6 libxrender-dev
pip install opencv-python

If problems with cv2 imports still appear, you should also install

pip install opencv-contrib-python
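To check whether the cv2 import problem is resolved before starting the app, a small script like the one below can help. It is a sketch that uses only the standard library; the helper name `has_module` is made up for this example.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("cv2"):
        print("cv2 is importable")
    else:
        print("cv2 is missing: install opencv-python or opencv-contrib-python")
```

Run it with the same interpreter (the same virtual environment) that you use for the app, otherwise the result can be misleading.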

Then you can run

```shell
python -m pip install -r requirements.txt
```

Run

To run it locally, please activate your environment

> win
venv\Scripts\activate.bat

> linux
source venv/bin/activate

and run it straight from the project root

python app.py

If everything goes properly, you'll see the app running at localhost:8000.

Updates

I decided to remove argparse because, as I mentioned earlier, it was less intuitive. The solution is not fast; it is more of a toy example that shows how to use a PyTorch model in a deployment environment.

The version I use here depends on torch-cpu, which makes preprocessing and detection slightly slower. I tested it on CUDA and it was much faster.
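If you want the same code to run on both CPU-only and CUDA installs, a device can be picked at runtime. The helper below is a sketch, not part of this repository: `pick_device` is a hypothetical name, and it falls back to "cpu" when PyTorch is not installed at all.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to move to a GPU.
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"
```

With PyTorch installed, the returned string can be passed to `model.to(...)` and `tensor.to(...)` so that the model and its inputs end up on the same device.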

If you have more information, drop me a line. If you like it, give it a star.

Draft: show how it works on a complex .tif example document.

Contact Info

Comments

failing with Could not find a version that satisfies the requirement cv2

eliuha@machine:~/dev/lpr/pytorch-text-recognition$ python app.py --input_file test_image.jpeg
/home/eliuha/anaconda3/lib/python3.6/site-packages/matplotlib/font_manager.py:279: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  'Matplotlib is building the font cache using fc-list. '
Traceback (most recent call last):
  File "app.py", line 19, in <module>
    import text_reco.models.craft.craft_utils as craft_utils
  File "/home/eliuha/dev/lpr/pytorch-text-recognition/text_reco/models/craft/craft_utils.py", line 6, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'

opened by eliuha 4

LinkRefiner

@s3nh clovaai just released the LinkRefiner code (clovaai/CRAFT-pytorch@3cd65f5). It would be good to implement it, along with an option to train, so that we can detect text lines.

opened by ghost 0

Training on custom data??

Hi, I'd like to know if this repo can be used to train CRAFT on custom data using weakly supervised learning? I have images with a lot of text on them. All of the text has been tagged, but at a sentence level, i.e., I have the bounding box information and ground truth label information only per sentence. Can weakly supervised learning be used in this case?

opened by dhruvsharma717 0

CRAFT Model memory usage

Seems your code is mostly inspired by: https://github.com/clovaai/CRAFT-pytorch

I was using the same repo for the text detection model and looked at the CPU memory usage of the CRAFT model. It increases after each image. Were you able to detect the same problem?

opened by p9anand 0

ValueError: tile cannot extend outside image

Hi, thanks for this amazing work. I was using this repo for a recognition task, but when I run app.py, upload the image and click on "get your results", it gives ValueError: tile cannot extend outside image.

opened by mahendra047 1
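This error message comes from Pillow, and it is commonly reported when a crop box has zero or negative size or reaches outside the image. Whether that is the cause in this issue is not confirmed, but assuming it is, clamping each detected box to the image bounds before cropping is one way to guard against it. The helper below is a sketch; `clamp_box` is a hypothetical name and not a function from this repository.

```python
def clamp_box(box, image_width, image_height):
    """Clamp (left, upper, right, lower) to the image; return None if nothing is left."""
    left, upper, right, lower = box
    # Force every coordinate into the valid pixel range of the image.
    left = max(0, min(int(left), image_width))
    right = max(0, min(int(right), image_width))
    upper = max(0, min(int(upper), image_height))
    lower = max(0, min(int(lower), image_height))
    # A box with no width or height cannot be cropped meaningfully.
    if right <= left or lower <= upper:
        return None
    return (left, upper, right, lower)
```

Boxes for which the function returns None can simply be skipped instead of being passed to the recognizer.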
