An Optical Character Recognition system using Pytesseract/Extracting data from Blood Pressure Reports.

Overview

Optical_Character_Recognition

As an IoT/Computer Vision intern in the Graduate Rotational Internship Program (GRIP) at The Sparks Foundation (TSF), my first task was to implement a character detector that extracts printed or handwritten text from an image or video.

To learn more, I applied this detector to cleaning and extracting valuable information from images of blood pressure reports.
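As a rough illustration of the extraction step, the sketch below pulls systolic/diastolic pairs out of text that OCR has already produced. The report layout is not described here, so the `120/80` pattern, the function name `parse_bp_readings`, and the plausibility ranges are all assumptions made for this example rather than the project's actual logic.

```python
import re

# Assumed pattern: two 2-3 digit numbers separated by a slash, e.g. "120/80".
BP_PATTERN = re.compile(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b")


def parse_bp_readings(text):
    """Return plausible blood pressure readings found in OCR text.

    Hypothetical helper: the numeric ranges below are illustrative
    filters that discard look-alikes such as dates (12/05), not
    clinical limits.
    """
    readings = []
    for match in BP_PATTERN.finditer(text):
        systolic, diastolic = int(match.group(1)), int(match.group(2))
        plausible = (
            70 <= systolic <= 250
            and 40 <= diastolic <= 150
            and systolic > diastolic
        )
        if plausible:
            readings.append({"systolic": systolic, "diastolic": diastolic})
    return readings
```

The returned list of dictionaries can be passed to `pandas.DataFrame(...)` if a tabular view is needed.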

Dependencies

  • tesseract-ocr package
  • pytesseract 0.3.8
  • OpenCV
  • pandas

    pytesseract, an open-source Python wrapper for the Tesseract engine, detects text in an image or video frame.

    OpenCV handles image processing.

    pandas handles data manipulation.
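One way these three dependencies might fit together is sketched below: OpenCV binarizes the image, pytesseract runs OCR on it, and pandas holds one row per recognized line. The function names, the Otsu thresholding step, and the default `--oem 3 --psm 6` settings are assumptions chosen for this example, not the project's confirmed pipeline.

```python
def build_tesseract_config(psm=6, oem=3):
    """Build a Tesseract config string (engine mode and page segmentation mode)."""
    return f"--oem {oem} --psm {psm}"


def ocr_image_to_frame(image_path, config=None):
    """Binarize an image, run OCR on it, and return one DataFrame row per text line."""
    # Imported lazily so the config helper works without the OCR stack installed.
    import cv2
    import pandas as pd
    import pytesseract

    if config is None:
        config = build_tesseract_config()

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file is unreadable.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu's method selects the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    text = pytesseract.image_to_string(binary, config=config)
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return pd.DataFrame({"text": lines})
```

Calling `ocr_image_to_frame("report.png")` (a placeholder path) would return a DataFrame with a single `text` column. The Tesseract binary itself must be installed separately, which is what the tesseract-ocr package in the dependency list provides.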

Owner

Ramsis Hammadi