Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Last update: Jul 13, 2022

Overview

A responsive document scanner built with U^2-Net, Textcleaner and Tesseract.

Toolset

U^2-Net is used for background removal
Textcleaner is used for image cleaning and line deskew (max 5 degrees)
Tesseract is used to correct the text rotation angle
Deskew is used for line deskew (between 5 and 45 degrees)
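
The two line-deskew ranges above suggest a simple routing rule. The sketch below is one way to express it, assuming the skew angle has already been estimated. `choose_deskew_tool` and its return labels are hypothetical names, not part of the project, and the handling of exactly 5 degrees and of angles above 45 degrees is an assumption.

```python
def choose_deskew_tool(angle_degrees: float) -> str:
    """Pick a line-deskew tool label from an estimated skew angle.

    Hypothetical helper: the thresholds come from the Toolset list,
    while the names and boundary handling are assumptions.
    """
    magnitude = abs(angle_degrees)
    if magnitude <= 5:
        # Textcleaner handles small skews (max 5 degrees).
        return "textcleaner"
    if magnitude <= 45:
        # Deskew handles skews between 5 and 45 degrees.
        return "deskew"
    # Assumption: larger angles fall outside both line-deskew ranges.
    return "out_of_range"


if __name__ == "__main__":
    for angle in (2.0, -12.5, 60.0):
        print(angle, choose_deskew_tool(angle))
```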

Examples

Tested on one document photographed with a smartphone camera from different angles.

To build & deploy

Clone the repo
Download the model: check app/saved_models/README.md
Build the Docker image: docker build -t / : .
Test locally: run the Docker image and check that the API is working by opening http://localhost:10000
- CPU : docker run -it -v $PWD:/LOCAL/ -p 10000:80 / :
- GPU : docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 / :
Push the Docker image to Docker Hub (optional):
- Check https://docs.docker.com/docker-hub/repos/ for account setup
- Create a Docker Hub repository with a name matching your image ID :
- Run docker push / :
Deploy to Cloud Run (optional):
- Create your google cloud account
- Push Docker Image to Google Container Registry
  - create new project called [PROJECT-ID]
  - Open Cloud Shell in your Google account and run:
    docker pull / :
    docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE]
    docker push gcr.io/[PROJECT-ID]/[IMAGE]
    More detail in this link
- Create a Cloud Run service and select the container image that was pushed
  - Screenshot of the config (for demo purposes, it will be cost-free)
- Click Deploy, and test the API URL that is displayed
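
Once the container is running, locally or on Cloud Run, the API URL can also be checked from a script. The following is a minimal sketch using only the Python standard library. `api_is_up` is a hypothetical helper, and it assumes the service answers a plain GET on its root URL with a non-error status, which the source does not confirm.

```python
import urllib.error
import urllib.request


def api_is_up(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if a GET on `url` succeeds with a 2xx or 3xx status.

    Hypothetical helper; `opener` is injectable so the check can be
    exercised without a running container.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False
```

For the local run described above, the call would be `api_is_up("http://localhost:10000")`; for Cloud Run, pass the URL shown after deployment.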

Limits and Areas for improvements

Speed: it takes 7 to 10 seconds to process one image on serverless Cloud Run. With a GPU we can save 2 to 3 seconds (U^2-Net is 3 times faster).
Textcleaner is slow and needs some manual fine-tuning, but it works better for image cleaning.
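
The timing figures above can be combined into a rough estimate. The sketch below assumes, as a simplification not stated in the source, that the whole GPU saving comes from the U^2-Net stage running 3 times faster. The function names are hypothetical.

```python
def u2net_cpu_seconds(saving_seconds: float, speedup: float = 3.0) -> float:
    """Infer the CPU time of the U^2-Net stage from the GPU saving.

    If the stage takes t on CPU and t / speedup on GPU, then
    saving = t - t / speedup, so t = saving * speedup / (speedup - 1).
    """
    return saving_seconds * speedup / (speedup - 1.0)


def gpu_total_range(cpu_range, saving_range):
    """Best-case and worst-case total time per image with a GPU."""
    cpu_low, cpu_high = cpu_range
    save_low, save_high = saving_range
    return cpu_low - save_high, cpu_high - save_low


print(u2net_cpu_seconds(2.0))            # 3.0
print(u2net_cpu_seconds(3.0))            # 4.5
print(gpu_total_range((7, 10), (2, 3)))  # (4, 8)
```

Under that assumption, U^2-Net would account for roughly 3 to 4.5 of the 7 to 10 seconds on CPU, and a GPU run would take about 4 to 8 seconds per image.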

References

U^2-Net https://github.com/xuebinqin/U-2-Net.git
Textcleaner http://www.fmwconcepts.com/imagemagick/textcleaner/
Tesseract https://github.com/tesseract-ocr/tesseract
Deskew https://github.com/sbrunner/deskew.git
