A simple OCR API server that is very easy to deploy with Docker, and on Heroku as well

Overview

ocrserver

Simple OCR server, as a small working sample for gosseract.

Try it now at https://ocr-example.herokuapp.com/, and deploy your own.

Deploy to Heroku

# Get the code
% git clone git@github.com:otiai10/ocrserver.git
% cd ocrserver
# Make your app
% heroku login
% heroku create
# Deploy the container
% heroku container:login
% heroku container:push web
# Enjoy it!
% heroku open

cf. heroku cli

Quick Start

Ready-Made Docker Image

% docker run -p 8080:8080 otiai10/ocrserver
# open http://localhost:8080

cf. docker

Development with Docker Image

% docker-compose up
# open http://localhost:8080

Need more languages?

% docker-compose build --build-arg LOAD_LANG=rus
% docker-compose up

cf. docker-compose

Manual Setup

If you have tesseract-ocr and its library files on your machine:

% go get github.com/otiai10/ocrserver/...
% PORT=8080 ocrserver
# open http://localhost:8080

cf. gosseract

Documents

Comments
  • The OCR results on the local machine are not the same as your demo website.

    I tried to read the image on my local machine, but the result is bad: [image]. But it is perfect on your demo website: [image]. I don't know why. Did you use a trained model for that?

    opened by leowilbur 6
  • Any plan to use gosseract develop branch for saving init cost?

    I found that this client's init method takes a lot of time when using ocrserver. Is there any plan to use the gosseract develop branch to save the init cost, or do you have any suggestions for me?

    opened by wangsongyan 3
  • fail to run main.go

    Hello

    I get the following error when running the code:

    # github.com/otiai10/gosseract/v2

    cc1.exe: sorry, unimplemented: 64-bit mode not compiled in

    This is reproducible every time.

    go env

    set GO111MODULE=
    set GOARCH=amd64
    set GOBIN=
    set GOCACHE=C:\Users\BlackPearl\AppData\Local\go-build
    set GOENV=C:\Users\BlackPearl\AppData\Roaming\go\env
    set GOEXE=.exe
    set GOFLAGS=
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOINSECURE=
    set GOMODCACHE=C:\Users\BlackPearl\go\pkg\mod
    set GONOPROXY=
    set GONOSUMDB=
    set GOOS=windows
    set GOPATH=C:\go
    set GOPRIVATE=
    set GOPROXY=https://proxy.golang.org,direct
    set GOROOT=c:\go
    set GOSUMDB=sum.golang.org
    set GOTMPDIR=
    set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
    set GCCGO=gccgo
    set AR=ar
    set CC=gcc
    set CXX=g++
    set CGO_ENABLED=1
    set GOMOD=
    set CGO_CFLAGS=-g -O2
    set CGO_CPPFLAGS=
    set CGO_CXXFLAGS=-g -O2
    set CGO_FFLAGS=-g -O2
    set CGO_LDFLAGS=-g -O2
    set PKG_CONFIG=pkg-config
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\BLACKP~1\AppData\Local\Temp\go-build118439850=/tmp/go-build -gno-record-gcc-switches

    go version

    go version go1.15.6 windows/amd64

    tesseract --version

    tesseract v4.0.0.20190314
    leptonica-1.78.0
    libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.2.0
    Found AVX2
    Found AVX
    Found SSE

    no reply 
    opened by damn-at 2
  • Docker quick start: whitelist isn't working!

    With the Docker quick start, the whitelist isn't working! The sample at https://ocr-example.herokuapp.com/ works well, but the image pulled from Docker does not.

    bug 
    opened by wuxue107 2
  • stdlib.h: No such file or directory

    I have an Ubuntu Server VM, as I hate Docker. The machine is clean and was created with the following commands:

    # Install go
    rm -r /temp &&
    mkdir /temp &&
    cd /temp &&
    wget https://dl.google.com/go/go1.13.7.linux-amd64.tar.gz &&
    tar -xvf go1.13.7.linux-amd64.tar.gz &&
    mv go /usr/local &&
    cd / &&
    rm -r /temp &&
    export GOROOT=/usr/local/go &&
    export PATH=$GOPATH/bin:$GOROOT/bin:$PATH

    # Install libs
    # 2 - Lib installation
    apt-get install -y \
    tesseract-ocr-all \
    libtesseract-dev \
    imagemagick \
    libleptonica-dev \
    gcc

    So when I run: go get github.com/otiai10/ocrserver/...

    The following error is thrown: [image]

    PS:

    1. Tesseract does work on its own, as I can use the command line to interact with it
    2. Go does work properly, as I tested a hello world from git
    opened by Zarovzky 2
  • Setting language via http.request

    Hi,

    I think your project is great and I like the Docker solution very much. However, I just can't get it to change the language in the header at the moment.

    Would you have a tip for me how I could change this?

    I already tried with Header.Add and writer.CreateFormField("languages") without success.

    Thanks a lot

    file, _ := os.Open(datei)
    defer file.Close()

    body := new(bytes.Buffer)
    writer := multipart.NewWriter(body)
    part2, err := writer.CreateFormField("languages")
    part, _ := writer.CreateFormFile("file", filepath.Base(file.Name()))

    if err != nil {
    	fmt.Printf("Error creatformfield: %s", err)
    }

    part2.Write([]byte("deu"))
    io.Copy(part, file)
    writer.Close()

    r, _ := http.NewRequest("POST", "http://10.6.0.3:9080/file", body)
    //r.Header.Add("languages", "deu")
    //r.Header.Set("languages", "deu")
    r.Header.Add("Content-Type", writer.FormDataContentType())
    
    opened by schabil 1
  • CORS not working

    I deployed this container to Heroku and sending POST requests via curl works great! However, I have trouble getting POST requests via a React web app due to CORS. What is the best way to enable CORS on this project, since it implements the request endpoints using marmoset? Thank you!

    opened by emersonhsieh 1
  • load languages option in Dockerfile should just be tesseract-ocr-all

    It would be more useful for the demo docker-compose up version to just install all languages (tesseract-ocr-all) instead of tesseract-ocr-jpn. If anything, the readme should explain how to change the Dockerfile to install all languages.

    opened by james-see 1
  • want to get the version of tesseract-ocr

    When I run this project on my Windows computer, I get a result that differs from your site (https://ocr-example.herokuapp.com/). I want to find the reason; maybe it is the tesseract-ocr version or the trained data.

    opened by wangsongyan 1
  • Optimize Dockerfile

    Optimize the Dockerfile: modify goproxy and LOAD_LANG, and run the built file on scratch. The old image size is about 800+ MB; the optimized image size is about 9 MB. It doesn't have bash and is only used as a web service.

    opened by RocsSun 0
  • Optimize Dockerfile

    Optimize the Dockerfile. The image built from the old Dockerfile is about 800+ MB; the image built from this Dockerfile is about 200 MB, or about 160 MB with an alpine base image, and the tests pass. The Docker Hub repository is redsun/ocr-server.

    opened by RocsSun 1
Owner
Hiromu OCHIAI
🙋 ❤️ 🍣
An image OCR tool with a GUI, using the Tesseract-OCR engine with the pytesseract package.

OCR-Tool It is an image OCR tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 4 Jul 11, 2022
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 6, 2021
A Python wrapper for the tesseract-ocr API

tesserocr A simple, Pillow-friendly, wrapper around the tesseract-ocr API for Optical Character Recognition (OCR). tesserocr integrates directly with

Fayez 1.7k Dec 31, 2022
FastOCR is a desktop application for OCR API.

FastOCR FastOCR is a desktop application for OCR API. Installation Arch Linux fastocr-git @ AUR Build from AUR or install with your favorite AUR helpe

Bruce Zhang 58 Jan 7, 2023
Use Youdao OCR API to convert your clipboard image to text.

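Alfred Clipboard OCR Note: this repository is a Python rewrite based on the logic of oott123/alfred-clipboard-ocr, switched to the Youdao AI API, which is more accurate and effectively avoids problems such as privacy leaks caused by Baidu; in addition, given its pricing, the 50-yuan trial credit that Youdao AI initially provides lets individual users use it essentially indefinitely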

Junlin Liu 6 Sep 19, 2022
"Very simple but works well" Computer Vision based ID verification solution provided by LibraX.

ID Verification by LibraX.ai This is the first free Identity verification in the market. LibraX.ai is an identity verification platform for developers

LibraX.ai 46 Dec 6, 2022
Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized

SceneTextPapers Tracking the latest progress in Scene Text Detection and Recognition: must-read papers well organized Information about this repositor

Shangbang Long 763 Jan 1, 2023
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

Jaided AI 16.7k Jan 3, 2023
OCR-D-compliant page segmentation

ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e

OCR-D 59 Sep 10, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 3, 2023
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.

Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr

Eric Ihli 311 Dec 24, 2022
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

Christian Bartz 496 Jan 5, 2023
A pure pytorch implemented ocr project including text detection and recognition

ocr.pytorch A pure pytorch implemented ocr project. Text detection is based on CTPN and text recognition is based on CRNN. More detection and recognition me

coura 444 Dec 30, 2022
python ocr using tesseract/ with EAST opencv detector

pytextractor python ocr using tesseract/ with EAST opencv text detector Uses the EAST opencv detector defined here with pytesseract to extract text(de

Danny Crasto 38 Dec 5, 2022
Run tesseract with the tesserocr bindings with @OCR-D's interfaces

ocrd_tesserocr Crop, deskew, segment into regions / tables / lines / words, or recognize with tesserocr Introduction This package offers OCR-D complia

OCR-D 38 Oct 14, 2022
A set of workflows for corpus building through OCR, post-correction and normalisation

PICCL: Philosophical Integrator of Computational and Corpus Libraries PICCL offers a workflow for corpus building and builds on a variety of tools. Th

Language Machines 41 Dec 27, 2022
Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Overview This collection demonstrates how to construct and train a deep, bidirectional stacked LSTM using CNN features as input with CTC loss to perfo

Jerod Weinman 489 Dec 21, 2022
🖺 OCR using tensorflow with attention

tensorflow-ocr 🖺 OCR using tensorflow with attention, batteries included Installation git clone --recursive http://github.com/pannous/tensorflow-ocr

null 646 Nov 11, 2022
This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Gated Recurrent Convolution Neural Network for OCR This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: htt

null 90 Dec 22, 2022